diff --git "a/pretrained/2023-12-12-mace-128-L1.log" "b/pretrained/2023-12-12-mace-128-L1.log" new file mode 100644--- /dev/null +++ "b/pretrained/2023-12-12-mace-128-L1.log" @@ -0,0 +1,6277 @@ +2023-12-02 22:41:23.048 INFO: Process group initialized: True +2023-12-02 22:41:23.050 INFO: Processes: 80 +2023-12-02 22:41:23.050 INFO: MACE version: 0.3.0 +2023-12-02 22:41:23.050 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, 
weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-02 22:41:23.050 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-02 22:41:23.051 INFO: Using statistics json file +2023-12-02 22:41:23.051 INFO: Using atomic numbers from statistics file +2023-12-02 22:41:23.051 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-02 22:41:23.051 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-02 22:41:23.052 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, 
-3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-02 22:42:00.036 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-02 22:42:00.039 INFO: Average number of neighbors: 61.964672446250916 +2023-12-02 22:42:00.039 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-02 22:42:00.039 INFO: Building model +2023-12-02 22:42:00.040 INFO: Hidden irreps: 128x0e+128x1o +2023-12-02 22:42:04.402 WARNING: Cannot find checkpoint with tag '03-faster-02_run-1' in 'checkpoints' +2023-12-02 22:42:04.409 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + 
) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + 
(products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-02 22:42:04.417 INFO: Number of parameters: 4688656 +2023-12-02 22:42:04.417 INFO: Optimizer: Adam 
( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-02 22:42:04.417 INFO: Using Weights and Biases for logging +2023-12-02 22:42:16.790 INFO: Using gradient clipping with tolerance=100.000 +2023-12-02 22:42:16.790 INFO: Started training +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.925 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.926 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 22:42:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-02 23:00:25.212 INFO: Epoch 0: loss=1.5651e-02, MAE_E_per_atom=189.5788 meV, MAE_F=114.2472 meV / A, MAE_stress_per_atom=0.3515 meV / A^3 +2023-12-02 23:10:30.454 INFO: Epoch 1: loss=1.3104e-02, MAE_E_per_atom=125.0212 meV, MAE_F=106.5669 meV / A, MAE_stress_per_atom=0.3153 meV / A^3 +2023-12-02 23:20:30.993 INFO: Epoch 2: loss=1.1658e-02, MAE_E_per_atom=87.9454 meV, MAE_F=98.6289 meV / A, MAE_stress_per_atom=0.3011 meV / A^3 +2023-12-02 23:30:34.418 INFO: Epoch 3: loss=1.0874e-02, MAE_E_per_atom=76.8646 meV, MAE_F=92.1640 meV / A, MAE_stress_per_atom=0.2613 meV / A^3 +2023-12-02 23:40:35.783 INFO: Epoch 4: loss=1.0286e-02, MAE_E_per_atom=69.6442 meV, MAE_F=86.7263 meV / A, MAE_stress_per_atom=0.2015 meV / A^3 +2023-12-02 23:50:37.083 INFO: Epoch 5: loss=9.7437e-03, MAE_E_per_atom=64.3483 meV, MAE_F=82.4910 meV / A, MAE_stress_per_atom=0.1838 meV / A^3 +2023-12-03 00:00:34.875 INFO: Epoch 6: loss=9.2353e-03, MAE_E_per_atom=59.1599 meV, MAE_F=79.2014 meV / A, MAE_stress_per_atom=0.1655 meV / A^3 +2023-12-03 00:10:34.652 INFO: Epoch 7: loss=8.5581e-03, MAE_E_per_atom=53.3720 meV, MAE_F=74.8991 meV / A, MAE_stress_per_atom=0.1428 meV / A^3 +2023-12-03 00:20:34.670 INFO: Epoch 8: loss=8.2653e-03, MAE_E_per_atom=50.2164 meV, MAE_F=72.8951 meV / A, MAE_stress_per_atom=0.1278 meV / A^3 +2023-12-03 00:30:49.865 INFO: Epoch 9: loss=8.2057e-03, MAE_E_per_atom=48.7847 meV, MAE_F=72.1514 meV / A, MAE_stress_per_atom=0.1517 meV / A^3 +2023-12-03 21:04:36.968 INFO: Process group initialized: True +2023-12-03 21:04:36.984 INFO: Processes: 80 +2023-12-03 21:04:36.984 INFO: MACE version: 0.3.0 +2023-12-03 21:04:36.984 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', 
checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 
'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-03 21:04:36.984 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-03 21:04:36.984 INFO: Using statistics json file +2023-12-03 21:04:36.984 INFO: Using atomic numbers from statistics file +2023-12-03 21:04:36.985 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-03 21:04:36.985 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-03 21:04:36.985 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, 
-8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-03 21:05:09.406 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-03 21:05:09.409 INFO: Average number of neighbors: 61.964672446250916 +2023-12-03 21:05:09.409 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-03 21:05:09.409 INFO: Building model +2023-12-03 21:05:09.410 INFO: Hidden irreps: 128x0e+128x1o +2023-12-03 21:05:13.528 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-03 21:05:13.529 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-9.pt +2023-12-03 21:05:13.752 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-03 21:05:13.759 INFO: Number of parameters: 4688656 +2023-12-03 21:05:13.759 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-03 21:05:13.760 INFO: Using Weights and Biases for logging +2023-12-03 21:05:27.807 INFO: Using gradient clipping with tolerance=100.000 +2023-12-03 21:05:27.807 INFO: Started training +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.474 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:05:35.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 21:23:30.906 INFO: Epoch 9: loss=7.8100e-03, MAE_E_per_atom=44.9880 meV, MAE_F=69.6055 meV / A, MAE_stress_per_atom=0.1323 meV / A^3 +2023-12-03 21:33:35.757 INFO: Epoch 10: loss=7.9405e-03, MAE_E_per_atom=46.7520 meV, MAE_F=71.7089 meV / A, MAE_stress_per_atom=0.1274 meV / A^3 +2023-12-03 21:43:37.307 INFO: Epoch 11: loss=7.5326e-03, MAE_E_per_atom=41.9432 meV, MAE_F=67.7979 meV / A, MAE_stress_per_atom=0.1219 meV / A^3 +2023-12-03 21:53:38.332 INFO: Epoch 12: loss=7.3450e-03, MAE_E_per_atom=40.0859 meV, MAE_F=66.3809 meV / A, MAE_stress_per_atom=0.1234 meV / A^3 +2023-12-03 22:03:41.154 INFO: Epoch 13: loss=7.3362e-03, MAE_E_per_atom=38.5955 meV, MAE_F=65.9189 meV / A, MAE_stress_per_atom=0.1193 meV / A^3 +2023-12-03 22:13:43.325 INFO: Epoch 14: loss=7.2844e-03, MAE_E_per_atom=37.4010 meV, MAE_F=64.7116 meV / A, MAE_stress_per_atom=0.1236 meV / A^3 +2023-12-03 22:23:45.951 INFO: Epoch 15: loss=7.2545e-03, MAE_E_per_atom=35.6112 meV, MAE_F=64.3218 meV / A, MAE_stress_per_atom=0.1291 meV / A^3 +2023-12-03 22:33:52.083 INFO: Epoch 16: loss=7.3025e-03, MAE_E_per_atom=35.3851 meV, MAE_F=65.1114 meV / A, MAE_stress_per_atom=0.1240 meV / A^3 +2023-12-03 22:43:54.892 INFO: Epoch 17: loss=7.1144e-03, MAE_E_per_atom=34.3925 meV, MAE_F=63.0793 meV / A, MAE_stress_per_atom=0.1237 meV / A^3 +2023-12-03 
22:53:57.565 INFO: Epoch 18: loss=7.0373e-03, MAE_E_per_atom=33.8412 meV, MAE_F=61.8664 meV / A, MAE_stress_per_atom=0.1309 meV / A^3 +2023-12-03 23:28:30.526 INFO: Process group initialized: True +2023-12-03 23:28:30.528 INFO: Processes: 80 +2023-12-03 23:28:30.528 INFO: MACE version: 0.3.0 +2023-12-03 23:28:30.528 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', 
lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-03 23:28:30.528 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-03 23:28:30.529 INFO: Using statistics json file +2023-12-03 23:28:30.529 INFO: Using atomic numbers from statistics file +2023-12-03 23:28:30.529 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-03 23:28:30.530 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-03 23:28:30.530 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, 
-2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-03 23:29:02.593 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-03 23:29:02.595 INFO: Average number of neighbors: 61.964672446250916 +2023-12-03 23:29:02.595 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-03 23:29:02.596 INFO: Building model +2023-12-03 23:29:02.597 INFO: Hidden irreps: 128x0e+128x1o +2023-12-03 23:29:06.140 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-03 23:29:06.141 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-18.pt +2023-12-03 23:29:06.356 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-03 23:29:06.363 INFO: Number of parameters: 4688656 +2023-12-03 23:29:06.363 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-03 23:29:06.363 INFO: Using Weights and Biases for logging +2023-12-03 23:29:19.645 INFO: Using gradient clipping with tolerance=100.000 +2023-12-03 23:29:19.645 INFO: Started training +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.230 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:29:27.231 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-03 23:47:43.452 INFO: Epoch 18: loss=7.0010e-03, MAE_E_per_atom=32.9507 meV, MAE_F=61.7643 meV / A, MAE_stress_per_atom=0.1346 meV / A^3 +2023-12-03 23:57:49.010 INFO: Epoch 19: loss=7.0149e-03, MAE_E_per_atom=32.4652 meV, MAE_F=60.9637 meV / A, MAE_stress_per_atom=0.1360 meV / A^3 +2023-12-04 00:07:46.787 INFO: Epoch 20: loss=7.0039e-03, MAE_E_per_atom=31.4488 meV, MAE_F=60.6937 meV / A, MAE_stress_per_atom=0.1412 meV / A^3 +2023-12-04 00:17:49.221 INFO: Epoch 21: loss=6.9376e-03, MAE_E_per_atom=30.8642 meV, MAE_F=60.1096 meV / A, MAE_stress_per_atom=0.1389 meV / A^3 +2023-12-04 00:27:51.080 INFO: Epoch 22: loss=6.7749e-03, MAE_E_per_atom=30.3103 meV, MAE_F=59.7506 meV / A, MAE_stress_per_atom=0.1379 meV / A^3 +2023-12-04 00:38:03.814 INFO: Epoch 23: loss=6.8292e-03, MAE_E_per_atom=29.7779 meV, MAE_F=58.8947 meV / A, MAE_stress_per_atom=0.1446 meV / A^3 +2023-12-04 00:48:18.758 INFO: Epoch 24: loss=6.7016e-03, MAE_E_per_atom=29.9185 meV, MAE_F=58.4956 meV / A, MAE_stress_per_atom=0.1408 meV / A^3 +2023-12-04 00:58:20.979 INFO: Epoch 25: loss=6.6102e-03, MAE_E_per_atom=29.2495 meV, MAE_F=58.1956 meV / A, MAE_stress_per_atom=0.1393 meV / A^3 +2023-12-04 01:08:21.927 INFO: Epoch 26: loss=6.6502e-03, MAE_E_per_atom=29.3362 meV, MAE_F=58.4724 meV / A, MAE_stress_per_atom=0.1378 meV / A^3 +2023-12-04 
01:18:21.812 INFO: Epoch 27: loss=6.4695e-03, MAE_E_per_atom=28.6180 meV, MAE_F=56.5232 meV / A, MAE_stress_per_atom=0.1392 meV / A^3 +2023-12-04 01:33:40.634 INFO: Process group initialized: True +2023-12-04 01:33:40.636 INFO: Processes: 80 +2023-12-04 01:33:40.636 INFO: MACE version: 0.3.0 +2023-12-04 01:33:40.636 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', 
lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-04 01:33:40.636 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-04 01:33:40.637 INFO: Using statistics json file +2023-12-04 01:33:40.637 INFO: Using atomic numbers from statistics file +2023-12-04 01:33:40.637 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-04 01:33:40.637 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-04 01:33:40.638 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, 
-2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-04 01:34:14.531 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-04 01:34:14.533 INFO: Average number of neighbors: 61.964672446250916 +2023-12-04 01:34:14.534 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-04 01:34:14.534 INFO: Building model +2023-12-04 01:34:14.535 INFO: Hidden irreps: 128x0e+128x1o +2023-12-04 01:34:18.833 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-04 01:34:18.834 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-27.pt +2023-12-04 01:34:19.055 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-04 01:34:19.062 INFO: Number of parameters: 4688656 +2023-12-04 01:34:19.062 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-04 01:34:19.062 INFO: Using Weights and Biases for logging +2023-12-04 01:34:31.290 INFO: Using gradient clipping with tolerance=100.000 +2023-12-04 01:34:31.290 INFO: Started training +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-04 01:34:39.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:34:39.630 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 01:52:49.290 INFO: Epoch 27: loss=6.3330e-03, MAE_E_per_atom=28.1715 meV, MAE_F=56.2650 meV / A, MAE_stress_per_atom=0.1375 meV / A^3 +2023-12-04 02:02:54.797 INFO: Epoch 28: loss=6.3484e-03, MAE_E_per_atom=28.3219 meV, MAE_F=55.9710 meV / A, MAE_stress_per_atom=0.1396 meV / A^3 +2023-12-04 02:12:57.130 INFO: Epoch 29: loss=6.2543e-03, MAE_E_per_atom=27.4718 meV, MAE_F=55.6076 meV / A, MAE_stress_per_atom=0.1455 meV / A^3 +2023-12-04 02:22:57.553 INFO: Epoch 30: loss=6.2856e-03, MAE_E_per_atom=27.8037 meV, MAE_F=55.8452 meV / A, MAE_stress_per_atom=0.1336 meV / A^3 +2023-12-04 02:33:04.883 INFO: Epoch 31: loss=6.2419e-03, MAE_E_per_atom=27.4773 meV, MAE_F=55.8558 meV / A, MAE_stress_per_atom=0.1352 meV / A^3 +2023-12-04 02:43:07.839 INFO: Epoch 32: loss=6.2419e-03, MAE_E_per_atom=27.4885 meV, MAE_F=55.4263 meV / A, MAE_stress_per_atom=0.1428 meV / A^3 +2023-12-04 02:53:14.647 INFO: Epoch 33: loss=6.1621e-03, MAE_E_per_atom=27.8727 meV, MAE_F=54.5352 meV / A, MAE_stress_per_atom=0.1439 meV / A^3 +2023-12-04 03:03:19.230 INFO: Epoch 34: loss=6.1170e-03, MAE_E_per_atom=27.6316 meV, MAE_F=54.7713 meV / A, MAE_stress_per_atom=0.1315 meV / A^3 +2023-12-04 03:13:19.561 INFO: Epoch 35: loss=6.1370e-03, MAE_E_per_atom=27.5249 meV, MAE_F=54.6406 meV / A, MAE_stress_per_atom=0.1346 meV / A^3 +2023-12-04 
03:23:25.816 INFO: Epoch 36: loss=6.0787e-03, MAE_E_per_atom=27.5750 meV, MAE_F=54.2939 meV / A, MAE_stress_per_atom=0.1350 meV / A^3 +2023-12-04 04:17:07.425 INFO: Process group initialized: True +2023-12-04 04:17:07.427 INFO: Processes: 80 +2023-12-04 04:17:07.428 INFO: MACE version: 0.3.0 +2023-12-04 04:17:07.428 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', 
lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-04 04:17:07.428 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-04 04:17:07.428 INFO: Using statistics json file +2023-12-04 04:17:07.428 INFO: Using atomic numbers from statistics file +2023-12-04 04:17:07.429 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-04 04:17:07.429 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-04 04:17:07.429 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, 
-2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-04 04:17:40.443 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-04 04:17:40.446 INFO: Average number of neighbors: 61.964672446250916 +2023-12-04 04:17:40.446 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-04 04:17:40.446 INFO: Building model +2023-12-04 04:17:40.447 INFO: Hidden irreps: 128x0e+128x1o +2023-12-04 04:17:44.572 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-04 04:17:44.573 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-36.pt +2023-12-04 04:17:44.801 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-04 04:17:44.807 INFO: Number of parameters: 4688656 +2023-12-04 04:17:44.807 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-04 04:17:44.807 INFO: Using Weights and Biases for logging +2023-12-04 04:17:59.477 INFO: Using gradient clipping with tolerance=100.000 +2023-12-04 04:17:59.477 INFO: Started training +2023-12-04 04:18:07.071 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.071 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.071 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-04 04:18:07.071 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.071 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.071 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.071 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.071 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.072 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:18:07.073 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-04 04:36:23.985 INFO: Epoch 36: loss=6.0989e-03, MAE_E_per_atom=27.4604 meV, MAE_F=54.1852 meV / A, MAE_stress_per_atom=0.1353 meV / A^3 +2023-12-04 04:46:34.451 INFO: Epoch 37: loss=6.0140e-03, MAE_E_per_atom=27.3548 meV, MAE_F=54.0183 meV / A, MAE_stress_per_atom=0.1375 meV / A^3 +2023-12-04 04:56:39.556 INFO: Epoch 38: loss=5.9090e-03, MAE_E_per_atom=26.7908 meV, MAE_F=53.5385 meV / A, MAE_stress_per_atom=0.1480 meV / A^3 +2023-12-04 05:06:41.455 INFO: Epoch 39: loss=6.6777e-03, MAE_E_per_atom=32.2805 meV, MAE_F=62.8859 meV / A, MAE_stress_per_atom=0.1267 meV / A^3 +2023-12-04 05:16:42.397 INFO: Epoch 40: loss=5.9339e-03, MAE_E_per_atom=28.0683 meV, MAE_F=55.0164 meV / A, MAE_stress_per_atom=0.1259 meV / A^3 +2023-12-04 05:46:16.916 INFO: Epoch 41: loss=5.8508e-03, MAE_E_per_atom=27.3653 meV, MAE_F=54.1171 meV / A, MAE_stress_per_atom=0.1259 meV / A^3 +2023-12-04 05:57:24.512 INFO: Epoch 42: loss=5.7785e-03, MAE_E_per_atom=27.1894 meV, MAE_F=53.2652 meV / A, MAE_stress_per_atom=0.1271 meV / A^3 +2023-12-04 06:07:25.615 INFO: Epoch 43: loss=5.7432e-03, MAE_E_per_atom=26.4269 meV, MAE_F=52.9813 meV / A, MAE_stress_per_atom=0.1329 meV / A^3 +2023-12-06 23:02:37.048 INFO: Process group initialized: True +2023-12-06 23:02:37.049 INFO: Processes: 80 +2023-12-06 23:02:37.050 INFO: MACE version: 0.3.0 
+2023-12-06 23:02:37.050 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-06 23:02:37.050 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-06 23:02:37.050 INFO: Using statistics json file +2023-12-06 23:02:37.050 INFO: Using atomic numbers from statistics file +2023-12-06 23:02:37.050 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-06 23:02:37.050 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-06 23:02:37.051 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, 
-3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-06 23:03:08.807 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-06 23:03:08.810 INFO: Average number of neighbors: 61.964672446250916 +2023-12-06 23:03:08.810 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-06 23:03:08.810 INFO: Building model +2023-12-06 23:03:08.811 INFO: Hidden irreps: 128x0e+128x1o +2023-12-06 23:03:12.904 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-06 23:03:12.906 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-43.pt +2023-12-06 23:03:13.130 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-06 23:03:13.136 INFO: Number of parameters: 4688656 +2023-12-06 23:03:13.136 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-06 23:03:13.136 INFO: Using Weights and Biases for logging +2023-12-06 23:03:50.458 INFO: Using gradient clipping with tolerance=100.000 +2023-12-06 23:03:50.459 INFO: Started training +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.216 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:03:59.217 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-06 23:22:13.491 INFO: Epoch 43: loss=5.7546e-03, MAE_E_per_atom=26.0365 meV, MAE_F=52.7776 meV / A, MAE_stress_per_atom=0.1360 meV / A^3 +2023-12-06 23:32:19.755 INFO: Epoch 44: loss=5.7668e-03, MAE_E_per_atom=26.4641 meV, MAE_F=52.4968 meV / A, MAE_stress_per_atom=0.1351 meV / A^3 +2023-12-06 23:42:22.814 INFO: Epoch 45: loss=5.7033e-03, MAE_E_per_atom=26.0215 meV, MAE_F=52.5104 meV / A, MAE_stress_per_atom=0.1362 meV / A^3 +2023-12-06 23:52:24.711 INFO: Epoch 46: loss=5.7623e-03, MAE_E_per_atom=25.6253 meV, MAE_F=52.2065 meV / A, MAE_stress_per_atom=0.1378 meV / A^3 +2023-12-07 00:02:25.132 INFO: Epoch 47: loss=5.6769e-03, MAE_E_per_atom=25.6294 meV, MAE_F=52.0382 meV / A, MAE_stress_per_atom=0.1346 meV / A^3 +2023-12-07 00:12:27.628 INFO: Epoch 48: loss=5.7192e-03, MAE_E_per_atom=25.5128 meV, MAE_F=52.0813 meV / A, MAE_stress_per_atom=0.1380 meV / A^3 +2023-12-07 00:22:27.343 INFO: Epoch 49: loss=5.6648e-03, MAE_E_per_atom=25.2875 meV, MAE_F=52.0400 meV / A, MAE_stress_per_atom=0.1319 meV / A^3 +2023-12-07 00:32:27.240 INFO: Epoch 50: loss=5.7291e-03, MAE_E_per_atom=25.4574 meV, MAE_F=52.0279 meV / A, MAE_stress_per_atom=0.1443 meV / A^3 +2023-12-07 00:42:30.227 INFO: Epoch 51: loss=5.6172e-03, MAE_E_per_atom=25.0846 meV, MAE_F=51.4949 meV / A, MAE_stress_per_atom=0.1352 meV / A^3 +2023-12-07 
00:52:30.835 INFO: Epoch 52: loss=5.5755e-03, MAE_E_per_atom=25.0800 meV, MAE_F=50.9347 meV / A, MAE_stress_per_atom=0.1397 meV / A^3 +2023-12-08 14:53:01.415 INFO: Process group initialized: True +2023-12-08 14:53:01.417 INFO: Processes: 80 +2023-12-08 14:53:01.417 INFO: MACE version: 0.3.0 +2023-12-08 14:53:01.418 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', 
lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-08 14:53:01.418 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-08 14:53:01.418 INFO: Using statistics json file +2023-12-08 14:53:01.418 INFO: Using atomic numbers from statistics file +2023-12-08 14:53:01.418 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-08 14:53:01.419 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-08 14:53:01.419 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, 
-2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-08 14:53:34.643 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-08 14:53:34.645 INFO: Average number of neighbors: 61.964672446250916 +2023-12-08 14:53:34.645 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-08 14:53:34.645 INFO: Building model +2023-12-08 14:53:34.646 INFO: Hidden irreps: 128x0e+128x1o +2023-12-08 14:53:38.862 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-08 14:53:38.863 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-52.pt +2023-12-08 14:53:39.102 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-08 14:53:39.108 INFO: Number of parameters: 4688656 +2023-12-08 14:53:39.109 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-08 14:53:39.109 INFO: Using Weights and Biases for logging +2023-12-08 14:53:52.010 INFO: Using gradient clipping with tolerance=100.000 +2023-12-08 14:53:52.011 INFO: Started training +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-08 14:53:59.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.622 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 14:53:59.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-08 15:11:59.053 INFO: Epoch 52: loss=5.5970e-03, MAE_E_per_atom=25.2434 meV, MAE_F=50.9796 meV / A, MAE_stress_per_atom=0.1427 meV / A^3 +2023-12-08 15:22:04.028 INFO: Epoch 53: loss=5.5956e-03, MAE_E_per_atom=24.8326 meV, MAE_F=50.9970 meV / A, MAE_stress_per_atom=0.1453 meV / A^3 +2023-12-08 15:32:03.693 INFO: Epoch 54: loss=5.5637e-03, MAE_E_per_atom=24.3584 meV, MAE_F=50.8269 meV / A, MAE_stress_per_atom=0.1410 meV / A^3 +2023-12-08 15:42:03.343 INFO: Epoch 55: loss=6.0015e-03, MAE_E_per_atom=25.6584 meV, MAE_F=53.7640 meV / A, MAE_stress_per_atom=0.1393 meV / A^3 +2023-12-08 15:52:03.739 INFO: Epoch 56: loss=5.6626e-03, MAE_E_per_atom=25.4845 meV, MAE_F=52.0257 meV / A, MAE_stress_per_atom=0.1321 meV / A^3 +2023-12-08 16:02:10.482 INFO: Epoch 57: loss=5.5067e-03, MAE_E_per_atom=25.0692 meV, MAE_F=50.8835 meV / A, MAE_stress_per_atom=0.1430 meV / A^3 +2023-12-08 16:12:10.498 INFO: Epoch 58: loss=5.5490e-03, MAE_E_per_atom=25.0525 meV, MAE_F=50.2648 meV / A, MAE_stress_per_atom=0.1398 meV / A^3 +2023-12-08 16:22:12.121 INFO: Epoch 59: loss=5.5690e-03, MAE_E_per_atom=24.6609 meV, MAE_F=50.9579 meV / A, MAE_stress_per_atom=0.1524 meV / A^3 +2023-12-08 16:32:11.050 INFO: Epoch 60: loss=5.5100e-03, MAE_E_per_atom=24.5883 meV, MAE_F=50.3146 meV / A, MAE_stress_per_atom=0.1369 meV / A^3 +2023-12-08 
16:42:09.117 INFO: Epoch 61: loss=5.4125e-03, MAE_E_per_atom=24.4680 meV, MAE_F=50.0029 meV / A, MAE_stress_per_atom=0.1363 meV / A^3 +2023-12-09 06:35:46.895 INFO: Process group initialized: True +2023-12-09 06:35:46.897 INFO: Processes: 80 +2023-12-09 06:35:46.897 INFO: MACE version: 0.3.0 +2023-12-09 06:35:46.897 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', 
lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-09 06:35:46.897 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-09 06:35:46.898 INFO: Using statistics json file +2023-12-09 06:35:46.898 INFO: Using atomic numbers from statistics file +2023-12-09 06:35:46.899 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-09 06:35:46.899 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-09 06:35:46.899 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, 
-2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-09 06:36:18.541 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-09 06:36:18.544 INFO: Average number of neighbors: 61.964672446250916 +2023-12-09 06:36:18.544 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-09 06:36:18.544 INFO: Building model +2023-12-09 06:36:18.545 INFO: Hidden irreps: 128x0e+128x1o +2023-12-09 06:36:22.661 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-09 06:36:22.663 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-61.pt +2023-12-09 06:36:22.889 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-09 06:36:22.895 INFO: Number of parameters: 4688656 +2023-12-09 06:36:22.895 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-09 06:36:22.895 INFO: Using Weights and Biases for logging +2023-12-09 06:36:36.514 INFO: Using gradient clipping with tolerance=100.000 +2023-12-09 06:36:36.514 INFO: Started training +2023-12-09 06:36:44.006 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.006 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.006 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 06:36:44.006 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.012 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.016 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:36:44.017 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 06:54:49.898 INFO: Epoch 61: loss=5.4036e-03, MAE_E_per_atom=24.3160 meV, MAE_F=49.9340 meV / A, MAE_stress_per_atom=0.1353 meV / A^3 +2023-12-09 07:04:23.211 INFO: Epoch 62: loss=5.5799e-03, MAE_E_per_atom=24.9227 meV, MAE_F=51.2300 meV / A, MAE_stress_per_atom=0.1377 meV / A^3 +2023-12-09 07:13:53.489 INFO: Epoch 63: loss=5.6230e-03, MAE_E_per_atom=25.0577 meV, MAE_F=51.2777 meV / A, MAE_stress_per_atom=0.1575 meV / A^3 +2023-12-09 07:23:22.736 INFO: Epoch 64: loss=5.4196e-03, MAE_E_per_atom=24.3967 meV, MAE_F=50.1036 meV / A, MAE_stress_per_atom=0.1323 meV / A^3 +2023-12-09 07:32:52.768 INFO: Epoch 65: loss=5.4084e-03, MAE_E_per_atom=24.1947 meV, MAE_F=49.7151 meV / A, MAE_stress_per_atom=0.1398 meV / A^3 +2023-12-09 07:42:23.335 INFO: Epoch 66: loss=5.4595e-03, MAE_E_per_atom=23.8951 meV, MAE_F=49.9054 meV / A, MAE_stress_per_atom=0.1307 meV / A^3 +2023-12-09 07:51:52.145 INFO: Epoch 67: loss=5.4229e-03, MAE_E_per_atom=23.9051 meV, MAE_F=49.3330 meV / A, MAE_stress_per_atom=0.1391 meV / A^3 +2023-12-09 08:01:25.829 INFO: Epoch 68: loss=5.3799e-03, MAE_E_per_atom=23.7766 meV, MAE_F=49.4012 meV / A, MAE_stress_per_atom=0.1339 meV / A^3 +2023-12-09 08:10:55.065 INFO: Epoch 69: loss=5.3365e-03, MAE_E_per_atom=23.9563 meV, MAE_F=49.3292 meV / A, MAE_stress_per_atom=0.1394 meV / A^3 +2023-12-09 
08:20:23.296 INFO: Epoch 70: loss=5.3305e-03, MAE_E_per_atom=23.4818 meV, MAE_F=49.4241 meV / A, MAE_stress_per_atom=0.1350 meV / A^3 +2023-12-09 08:29:54.978 INFO: Epoch 71: loss=5.3500e-03, MAE_E_per_atom=23.6442 meV, MAE_F=49.3240 meV / A, MAE_stress_per_atom=0.1316 meV / A^3 +2023-12-09 08:40:50.835 INFO: Process group initialized: True +2023-12-09 08:40:50.837 INFO: Processes: 80 +2023-12-09 08:40:50.837 INFO: MACE version: 0.3.0 +2023-12-09 08:40:50.837 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, 
optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-09 08:40:50.837 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-09 08:40:50.839 INFO: Using statistics json file +2023-12-09 08:40:50.839 INFO: Using atomic numbers from statistics file +2023-12-09 08:40:50.839 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-09 08:40:50.839 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-09 08:40:50.840 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, 
-1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-09 08:41:27.733 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-09 08:41:27.736 INFO: Average number of neighbors: 61.964672446250916 +2023-12-09 08:41:27.736 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-09 08:41:27.736 INFO: Building model +2023-12-09 08:41:27.737 INFO: Hidden irreps: 128x0e+128x1o +2023-12-09 08:41:32.299 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-09 08:41:32.300 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-71.pt +2023-12-09 08:41:32.522 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-09 08:41:32.529 INFO: Number of parameters: 4688656 +2023-12-09 08:41:32.529 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-09 08:41:32.529 INFO: Using Weights and Biases for logging +2023-12-09 08:41:45.043 INFO: Using gradient clipping with tolerance=100.000 +2023-12-09 08:41:45.043 INFO: Started training +2023-12-09 08:41:52.935 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.935 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.935 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 08:41:52.935 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.939 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:41:52.940 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 08:59:59.821 INFO: Epoch 71: loss=5.3882e-03, MAE_E_per_atom=23.3594 meV, MAE_F=49.5646 meV / A, MAE_stress_per_atom=0.1329 meV / A^3 +2023-12-09 09:10:06.749 INFO: Epoch 72: loss=5.3114e-03, MAE_E_per_atom=23.4361 meV, MAE_F=49.0596 meV / A, MAE_stress_per_atom=0.1359 meV / A^3 +2023-12-09 09:20:10.640 INFO: Epoch 73: loss=5.3660e-03, MAE_E_per_atom=23.1947 meV, MAE_F=49.2526 meV / A, MAE_stress_per_atom=0.1509 meV / A^3 +2023-12-09 09:30:12.150 INFO: Epoch 74: loss=5.2807e-03, MAE_E_per_atom=23.4016 meV, MAE_F=48.7142 meV / A, MAE_stress_per_atom=0.1356 meV / A^3 +2023-12-09 09:40:12.660 INFO: Epoch 75: loss=5.3168e-03, MAE_E_per_atom=23.3488 meV, MAE_F=48.7872 meV / A, MAE_stress_per_atom=0.1511 meV / A^3 +2023-12-09 09:50:12.497 INFO: Epoch 76: loss=5.2920e-03, MAE_E_per_atom=23.2997 meV, MAE_F=48.8116 meV / A, MAE_stress_per_atom=0.1483 meV / A^3 +2023-12-09 10:00:12.367 INFO: Epoch 77: loss=5.2816e-03, MAE_E_per_atom=23.5583 meV, MAE_F=48.7768 meV / A, MAE_stress_per_atom=0.1410 meV / A^3 +2023-12-09 10:10:12.455 INFO: Epoch 78: loss=5.2830e-03, MAE_E_per_atom=23.1176 meV, MAE_F=48.9296 meV / A, MAE_stress_per_atom=0.1322 meV / A^3 +2023-12-09 10:20:14.185 INFO: Epoch 79: loss=5.3040e-03, MAE_E_per_atom=22.9708 meV, MAE_F=48.7202 meV / A, MAE_stress_per_atom=0.1485 meV / A^3 +2023-12-09 
10:30:15.738 INFO: Epoch 80: loss=5.2301e-03, MAE_E_per_atom=23.0120 meV, MAE_F=48.4323 meV / A, MAE_stress_per_atom=0.1341 meV / A^3 +2023-12-09 10:59:28.853 INFO: Process group initialized: True +2023-12-09 10:59:28.855 INFO: Processes: 80 +2023-12-09 10:59:28.855 INFO: MACE version: 0.3.0 +2023-12-09 10:59:28.855 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', 
lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-09 10:59:28.855 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-09 10:59:28.856 INFO: Using statistics json file +2023-12-09 10:59:28.856 INFO: Using atomic numbers from statistics file +2023-12-09 10:59:28.856 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-09 10:59:28.856 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-09 10:59:28.857 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, 
-2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-09 11:00:00.336 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-09 11:00:00.338 INFO: Average number of neighbors: 61.964672446250916 +2023-12-09 11:00:00.338 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-09 11:00:00.339 INFO: Building model +2023-12-09 11:00:00.340 INFO: Hidden irreps: 128x0e+128x1o +2023-12-09 11:00:04.464 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-09 11:00:04.466 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-80.pt +2023-12-09 11:00:04.694 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-09 11:00:04.700 INFO: Number of parameters: 4688656 +2023-12-09 11:00:04.700 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-09 11:00:04.700 INFO: Using Weights and Biases for logging +2023-12-09 11:00:17.936 INFO: Using gradient clipping with tolerance=100.000 +2023-12-09 11:00:17.937 INFO: Started training +2023-12-09 11:00:25.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.626 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 11:00:25.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.626 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.626 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.628 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:00:25.629 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 11:17:23.907 INFO: Epoch 80: loss=5.2411e-03, MAE_E_per_atom=22.7064 meV, MAE_F=48.4087 meV / A, MAE_stress_per_atom=0.1348 meV / A^3 +2023-12-09 11:26:29.662 INFO: Epoch 81: loss=5.2813e-03, MAE_E_per_atom=22.8484 meV, MAE_F=48.8738 meV / A, MAE_stress_per_atom=0.1427 meV / A^3 +2023-12-09 11:35:25.385 INFO: Epoch 82: loss=5.2693e-03, MAE_E_per_atom=22.9214 meV, MAE_F=48.5185 meV / A, MAE_stress_per_atom=0.1332 meV / A^3 +2023-12-09 11:44:18.996 INFO: Epoch 83: loss=5.2928e-03, MAE_E_per_atom=22.6126 meV, MAE_F=48.6254 meV / A, MAE_stress_per_atom=0.1335 meV / A^3 +2023-12-09 11:53:13.189 INFO: Epoch 84: loss=5.2988e-03, MAE_E_per_atom=22.5262 meV, MAE_F=48.7944 meV / A, MAE_stress_per_atom=0.1327 meV / A^3 +2023-12-09 12:02:08.964 INFO: Epoch 85: loss=5.2378e-03, MAE_E_per_atom=22.3368 meV, MAE_F=48.3277 meV / A, MAE_stress_per_atom=0.1350 meV / A^3 +2023-12-09 12:11:04.263 INFO: Epoch 86: loss=5.2450e-03, MAE_E_per_atom=22.4312 meV, MAE_F=47.9789 meV / A, MAE_stress_per_atom=0.1449 meV / A^3 +2023-12-09 12:19:57.789 INFO: Epoch 87: loss=5.3568e-03, MAE_E_per_atom=22.3035 meV, MAE_F=48.8584 meV / A, MAE_stress_per_atom=0.1483 meV / A^3 +2023-12-09 12:28:53.760 INFO: Epoch 88: loss=5.2486e-03, MAE_E_per_atom=22.1611 meV, MAE_F=48.0358 meV / A, MAE_stress_per_atom=0.1378 meV / A^3 +2023-12-09 
12:37:48.485 INFO: Epoch 89: loss=5.2619e-03, MAE_E_per_atom=22.6406 meV, MAE_F=48.4690 meV / A, MAE_stress_per_atom=0.1467 meV / A^3 +2023-12-09 12:46:46.371 INFO: Epoch 90: loss=5.1996e-03, MAE_E_per_atom=22.1097 meV, MAE_F=48.2876 meV / A, MAE_stress_per_atom=0.1457 meV / A^3 +2023-12-09 12:55:43.092 INFO: Epoch 91: loss=5.2536e-03, MAE_E_per_atom=22.1912 meV, MAE_F=48.2910 meV / A, MAE_stress_per_atom=0.1362 meV / A^3 +2023-12-09 13:06:54.147 INFO: Process group initialized: True +2023-12-09 13:06:54.148 INFO: Processes: 80 +2023-12-09 13:06:54.148 INFO: MACE version: 0.3.0 +2023-12-09 13:06:54.149 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, 
stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-09 13:06:54.149 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-09 13:06:54.150 INFO: Using statistics json file +2023-12-09 13:06:54.150 INFO: Using atomic numbers from statistics file +2023-12-09 13:06:54.150 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-09 13:06:54.150 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-09 13:06:54.151 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, 
-8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-09 13:07:26.417 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-09 13:07:26.419 INFO: Average number of neighbors: 61.964672446250916 +2023-12-09 13:07:26.419 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-09 13:07:26.419 INFO: Building model +2023-12-09 13:07:26.420 INFO: Hidden irreps: 128x0e+128x1o +2023-12-09 13:07:30.591 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-09 13:07:30.593 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-91.pt +2023-12-09 13:07:30.811 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-09 13:07:30.817 INFO: Number of parameters: 4688656 +2023-12-09 13:07:30.818 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-09 13:07:30.818 INFO: Using Weights and Biases for logging +2023-12-09 13:07:44.037 INFO: Using gradient clipping with tolerance=100.000 +2023-12-09 13:07:44.038 INFO: Started training +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 13:07:51.569 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration.
+2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration.
+2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration.
+2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration.
+2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration.
+2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration.
+2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration.
+2023-12-09 13:07:51.570 INFO: Reducer buckets have been rebuilt in this iteration.
+2023-12-09 13:25:42.283 INFO: Epoch 91: loss=5.2827e-03, MAE_E_per_atom=22.3376 meV, MAE_F=48.2865 meV / A, MAE_stress_per_atom=0.1305 meV / A^3
+2023-12-09 13:35:47.242 INFO: Epoch 92: loss=5.2205e-03, MAE_E_per_atom=21.9415 meV, MAE_F=48.0294 meV / A, MAE_stress_per_atom=0.1380 meV / A^3
+2023-12-09 13:45:49.474 INFO: Epoch 93: loss=5.1885e-03, MAE_E_per_atom=22.2376 meV, MAE_F=47.8596 meV / A, MAE_stress_per_atom=0.1339 meV / A^3
+2023-12-09 13:55:51.233 INFO: Epoch 94: loss=5.2357e-03, MAE_E_per_atom=21.9536 meV, MAE_F=48.2543 meV / A, MAE_stress_per_atom=0.1353 meV / A^3
+2023-12-09 14:05:49.174 INFO: Epoch 95: loss=5.2390e-03, MAE_E_per_atom=21.9314 meV, MAE_F=48.0024 meV / A, MAE_stress_per_atom=0.1416 meV / A^3
+2023-12-09 14:15:49.919 INFO: Epoch 96: loss=5.1318e-03, MAE_E_per_atom=22.0857 meV, MAE_F=47.6157 meV / A, MAE_stress_per_atom=0.1296 meV / A^3
+2023-12-09 14:25:50.147 INFO: Epoch 97: loss=6.8578e-03, MAE_E_per_atom=39.0242 meV, MAE_F=61.7531 meV / A, MAE_stress_per_atom=0.1577 meV / A^3
+2023-12-09 14:35:51.631 INFO: Epoch 98: loss=5.2761e-03, MAE_E_per_atom=22.8749 meV, MAE_F=49.3756 meV / A, MAE_stress_per_atom=0.1224 meV / A^3
+2023-12-09 14:45:53.821 INFO: Epoch 99: loss=5.2245e-03, MAE_E_per_atom=22.7225 meV, MAE_F=48.4537 meV / A, MAE_stress_per_atom=0.1278 meV / A^3
+2023-12-09
14:55:56.106 INFO: Epoch 100: loss=5.3241e-03, MAE_E_per_atom=22.3499 meV, MAE_F=49.4637 meV / A, MAE_stress_per_atom=0.1321 meV / A^3 +2023-12-09 16:48:03.969 INFO: Process group initialized: True +2023-12-09 16:48:03.971 INFO: Processes: 80 +2023-12-09 16:48:03.971 INFO: MACE version: 0.3.0 +2023-12-09 16:48:03.972 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', 
lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-09 16:48:03.972 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-09 16:48:03.972 INFO: Using statistics json file +2023-12-09 16:48:03.972 INFO: Using atomic numbers from statistics file +2023-12-09 16:48:03.972 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-09 16:48:03.972 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-09 16:48:03.973 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, 
-2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-09 16:48:38.250 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-09 16:48:38.253 INFO: Average number of neighbors: 61.964672446250916 +2023-12-09 16:48:38.253 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-09 16:48:38.253 INFO: Building model +2023-12-09 16:48:38.254 INFO: Hidden irreps: 128x0e+128x1o +2023-12-09 16:48:43.048 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-09 16:48:43.051 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-100.pt +2023-12-09 16:48:43.272 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-09 16:48:43.278 INFO: Number of parameters: 4688656 +2023-12-09 16:48:43.278 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.004 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-09 16:48:43.278 INFO: Using Weights and Biases for logging +2023-12-09 16:49:17.193 INFO: Using gradient clipping with tolerance=100.000 +2023-12-09 16:49:17.193 INFO: Started training +2023-12-09 16:49:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.928 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 16:49:24.928 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.930 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.929 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 16:49:24.930 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 17:06:54.011 INFO: Epoch 100: loss=5.2960e-03, MAE_E_per_atom=22.1031 meV, MAE_F=48.7585 meV / A, MAE_stress_per_atom=0.1432 meV / A^3 +2023-12-09 17:15:51.131 INFO: Epoch 101: loss=5.8176e-03, MAE_E_per_atom=24.8536 meV, MAE_F=52.5203 meV / A, MAE_stress_per_atom=0.1417 meV / A^3 +2023-12-09 17:24:43.603 INFO: Epoch 102: loss=5.5294e-03, MAE_E_per_atom=23.3563 meV, MAE_F=49.7848 meV / A, MAE_stress_per_atom=0.1267 meV / A^3 +2023-12-09 17:33:39.318 INFO: Epoch 103: loss=5.4688e-03, MAE_E_per_atom=23.0047 meV, MAE_F=48.8278 meV / A, MAE_stress_per_atom=0.1302 meV / A^3 +2023-12-09 17:42:36.467 INFO: Epoch 104: loss=5.4223e-03, MAE_E_per_atom=22.7013 meV, MAE_F=48.2928 meV / A, MAE_stress_per_atom=0.1324 meV / A^3 +2023-12-09 17:51:32.119 INFO: Epoch 105: loss=5.4002e-03, MAE_E_per_atom=22.4328 meV, MAE_F=47.8605 meV / A, MAE_stress_per_atom=0.1309 meV / A^3 +2023-12-09 18:00:32.067 INFO: Epoch 106: loss=5.3613e-03, MAE_E_per_atom=22.1876 meV, MAE_F=47.7299 meV / A, MAE_stress_per_atom=0.1338 meV / A^3 +2023-12-09 18:09:27.328 INFO: Epoch 107: loss=5.3308e-03, MAE_E_per_atom=21.9601 meV, MAE_F=47.5168 meV / A, MAE_stress_per_atom=0.1320 meV / A^3 +2023-12-09 18:18:23.491 INFO: Epoch 108: loss=5.2786e-03, MAE_E_per_atom=22.1091 meV, MAE_F=47.0873 meV / A, MAE_stress_per_atom=0.1330 meV / A^3 
+2023-12-09 18:27:19.834 INFO: Epoch 109: loss=5.2978e-03, MAE_E_per_atom=21.9229 meV, MAE_F=47.1072 meV / A, MAE_stress_per_atom=0.1412 meV / A^3 +2023-12-09 18:36:15.585 INFO: Epoch 110: loss=5.2751e-03, MAE_E_per_atom=21.7241 meV, MAE_F=47.2007 meV / A, MAE_stress_per_atom=0.1377 meV / A^3 +2023-12-09 18:49:51.400 INFO: Process group initialized: True +2023-12-09 18:49:51.401 INFO: Processes: 80 +2023-12-09 18:49:51.402 INFO: MACE version: 0.3.0 +2023-12-09 18:49:51.402 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', 
huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-09 18:49:51.402 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-09 18:49:51.402 INFO: Using statistics json file +2023-12-09 18:49:51.402 INFO: Using atomic numbers from statistics file +2023-12-09 18:49:51.402 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-09 18:49:51.402 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-09 18:49:51.403 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, 
-3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-09 18:50:24.320 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-09 18:50:24.322 INFO: Average number of neighbors: 61.964672446250916 +2023-12-09 18:50:24.322 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-09 18:50:24.322 INFO: Building model +2023-12-09 18:50:24.323 INFO: Hidden irreps: 128x0e+128x1o +2023-12-09 18:50:28.545 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-09 18:50:28.548 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-110.pt +2023-12-09 18:50:28.774 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-09 18:50:28.780 INFO: Number of parameters: 4688656 +2023-12-09 18:50:28.780 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00256 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00256 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00256 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00256 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00256 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-09 18:50:28.780 INFO: Using Weights and Biases for logging +2023-12-09 18:50:44.756 INFO: Using gradient clipping with tolerance=100.000 +2023-12-09 18:50:44.756 INFO: Started training +2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.624 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.625 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 18:50:52.627 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-09 19:08:45.641 INFO: Epoch 110: loss=5.2663e-03, MAE_E_per_atom=21.7045 meV, MAE_F=47.1609 meV / A, MAE_stress_per_atom=0.1400 meV / A^3 +2023-12-09 19:18:55.412 INFO: Epoch 111: loss=5.2982e-03, MAE_E_per_atom=21.9979 meV, MAE_F=47.3265 meV / A, MAE_stress_per_atom=0.1368 meV / A^3 +2023-12-09 19:28:55.445 INFO: Epoch 112: loss=5.2541e-03, MAE_E_per_atom=21.7174 meV, MAE_F=47.0514 meV / A, MAE_stress_per_atom=0.1322 meV / A^3 +2023-12-09 19:38:59.738 INFO: Epoch 113: loss=5.2646e-03, MAE_E_per_atom=21.7584 meV, MAE_F=47.2301 meV / A, MAE_stress_per_atom=0.1313 meV / A^3 +2023-12-09 19:49:01.453 INFO: Epoch 114: loss=5.2313e-03, MAE_E_per_atom=21.6573 meV, MAE_F=47.0185 meV / A, MAE_stress_per_atom=0.1289 meV / A^3 +2023-12-09 19:59:02.035 INFO: Epoch 115: loss=5.2972e-03, MAE_E_per_atom=21.7430 meV, MAE_F=46.8115 meV / A, MAE_stress_per_atom=0.1503 meV / A^3 +2023-12-09 20:09:01.999 INFO: Epoch 116: loss=5.2574e-03, MAE_E_per_atom=21.7485 meV, MAE_F=46.9482 meV / A, MAE_stress_per_atom=0.1301 meV / A^3 +2023-12-09 20:19:02.616 INFO: Epoch 117: loss=5.2123e-03, MAE_E_per_atom=21.3550 meV, MAE_F=46.8814 meV / A, MAE_stress_per_atom=0.1310 meV / A^3 +2023-12-09 20:29:03.845 INFO: Epoch 118: loss=5.2398e-03, MAE_E_per_atom=21.6509 meV, MAE_F=46.8091 meV / A, MAE_stress_per_atom=0.1451 meV / A^3 
+2023-12-09 20:39:05.567 INFO: Epoch 119: loss=5.2015e-03, MAE_E_per_atom=21.6668 meV, MAE_F=46.7574 meV / A, MAE_stress_per_atom=0.1298 meV / A^3 +2023-12-10 02:15:28.277 INFO: Process group initialized: True +2023-12-10 02:15:28.279 INFO: Processes: 80 +2023-12-10 02:15:28.279 INFO: MACE version: 0.3.0 +2023-12-10 02:15:28.279 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, 
scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-10 02:15:28.279 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-10 02:15:28.281 INFO: Using statistics json file +2023-12-10 02:15:28.281 INFO: Using atomic numbers from statistics file +2023-12-10 02:15:28.281 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-10 02:15:28.281 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-10 02:15:28.282 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, 
-2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-10 02:16:00.590 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-10 02:16:00.593 INFO: Average number of neighbors: 61.964672446250916 +2023-12-10 02:16:00.593 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-10 02:16:00.593 INFO: Building model +2023-12-10 02:16:00.594 INFO: Hidden irreps: 128x0e+128x1o +2023-12-10 02:16:04.732 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-10 02:16:04.735 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-119.pt +2023-12-10 02:16:04.954 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-10 02:16:04.961 INFO: Number of parameters: 4688656 +2023-12-10 02:16:04.961 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0020480000000000003 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0020480000000000003 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0020480000000000003 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0020480000000000003 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0020480000000000003 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-10 02:16:04.961 INFO: Using Weights and Biases for logging +2023-12-10 02:16:18.090 INFO: Using gradient clipping with tolerance=100.000 +2023-12-10 02:16:18.090 INFO: Started training +2023-12-10 02:16:25.706 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.706 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 02:16:25.706 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.706 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.709 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.709 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.709 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.709 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.710 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:16:25.711 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 02:34:42.920 INFO: Epoch 119: loss=5.1763e-03, MAE_E_per_atom=21.6438 meV, MAE_F=46.7010 meV / A, MAE_stress_per_atom=0.1278 meV / A^3 +2023-12-10 02:44:54.518 INFO: Epoch 120: loss=5.1852e-03, MAE_E_per_atom=21.6082 meV, MAE_F=46.6824 meV / A, MAE_stress_per_atom=0.1257 meV / A^3 +2023-12-10 02:54:56.018 INFO: Epoch 121: loss=5.2197e-03, MAE_E_per_atom=21.5121 meV, MAE_F=46.7248 meV / A, MAE_stress_per_atom=0.1264 meV / A^3 +2023-12-10 03:04:55.826 INFO: Epoch 122: loss=5.1491e-03, MAE_E_per_atom=21.3798 meV, MAE_F=46.4375 meV / A, MAE_stress_per_atom=0.1258 meV / A^3 +2023-12-10 03:14:55.471 INFO: Epoch 123: loss=5.2067e-03, MAE_E_per_atom=21.5762 meV, MAE_F=46.3614 meV / A, MAE_stress_per_atom=0.1422 meV / A^3 +2023-12-10 03:25:01.082 INFO: Epoch 124: loss=5.1858e-03, MAE_E_per_atom=21.5408 meV, MAE_F=46.3587 meV / A, MAE_stress_per_atom=0.1412 meV / A^3 +2023-12-10 03:35:03.958 INFO: Epoch 125: loss=5.1983e-03, MAE_E_per_atom=21.7479 meV, MAE_F=46.4963 meV / A, MAE_stress_per_atom=0.1525 meV / A^3 +2023-12-10 03:45:05.865 INFO: Epoch 126: loss=5.1886e-03, MAE_E_per_atom=21.5340 meV, MAE_F=46.3082 meV / A, MAE_stress_per_atom=0.1361 meV / A^3 +2023-12-10 03:55:09.993 INFO: Epoch 127: loss=5.1727e-03, 
MAE_E_per_atom=21.4586 meV, MAE_F=46.1418 meV / A, MAE_stress_per_atom=0.1366 meV / A^3 +2023-12-10 04:05:12.637 INFO: Epoch 128: loss=5.2020e-03, MAE_E_per_atom=21.4649 meV, MAE_F=46.3036 meV / A, MAE_stress_per_atom=0.1428 meV / A^3 +2023-12-10 05:52:50.788 INFO: Process group initialized: True +2023-12-10 05:52:50.790 INFO: Processes: 80 +2023-12-10 05:52:50.790 INFO: MACE version: 0.3.0 +2023-12-10 05:52:50.790 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, 
lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-10 05:52:50.790 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-10 05:52:50.791 INFO: Using statistics json file +2023-12-10 05:52:50.791 INFO: Using atomic numbers from statistics file +2023-12-10 05:52:50.791 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-10 05:52:50.791 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-10 05:52:50.792 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, 
-3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-10 05:53:23.026 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-10 05:53:23.028 INFO: Average number of neighbors: 61.964672446250916 +2023-12-10 05:53:23.028 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-10 05:53:23.028 INFO: Building model +2023-12-10 05:53:23.029 INFO: Hidden irreps: 128x0e+128x1o +2023-12-10 05:53:27.236 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-10 05:53:27.238 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-128.pt +2023-12-10 05:53:27.460 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-10 05:53:27.466 INFO: Number of parameters: 4688656 +2023-12-10 05:53:27.466 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0013107200000000005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0013107200000000005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0013107200000000005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0013107200000000005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0013107200000000005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-10 05:53:27.466 INFO: Using Weights and Biases for logging +2023-12-10 05:53:41.178 INFO: Using gradient clipping with tolerance=100.000 +2023-12-10 05:53:41.179 INFO: Started training +2023-12-10 05:53:48.870 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.870 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 05:53:48.870 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.870 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.872 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 05:53:48.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 06:11:47.007 INFO: Epoch 128: loss=5.1907e-03, MAE_E_per_atom=21.4718 meV, MAE_F=46.2833 meV / A, MAE_stress_per_atom=0.1412 meV / A^3 +2023-12-10 06:21:53.773 INFO: Epoch 129: loss=5.1304e-03, MAE_E_per_atom=21.1537 meV, MAE_F=46.3270 meV / A, MAE_stress_per_atom=0.1313 meV / A^3 +2023-12-10 06:31:51.982 INFO: Epoch 130: loss=5.1097e-03, MAE_E_per_atom=21.1548 meV, MAE_F=46.1861 meV / A, MAE_stress_per_atom=0.1288 meV / A^3 +2023-12-10 06:41:48.746 INFO: Epoch 131: loss=5.1561e-03, MAE_E_per_atom=21.3436 meV, MAE_F=46.2789 meV / A, MAE_stress_per_atom=0.1394 meV / A^3 +2023-12-10 06:51:45.583 INFO: Epoch 132: loss=5.1566e-03, MAE_E_per_atom=21.0486 meV, MAE_F=46.0963 meV / A, MAE_stress_per_atom=0.1448 meV / A^3 +2023-12-10 07:01:44.145 INFO: Epoch 133: loss=5.1539e-03, MAE_E_per_atom=21.2794 meV, MAE_F=46.3039 meV / A, MAE_stress_per_atom=0.1372 meV / A^3 +2023-12-10 07:11:42.364 INFO: Epoch 134: loss=5.1444e-03, MAE_E_per_atom=21.2980 meV, MAE_F=46.0271 meV / A, MAE_stress_per_atom=0.1291 meV / A^3 +2023-12-10 07:21:42.344 INFO: Epoch 135: loss=5.1734e-03, MAE_E_per_atom=21.0557 meV, MAE_F=46.1844 meV / A, MAE_stress_per_atom=0.1402 meV / A^3 +2023-12-10 07:31:42.321 INFO: Epoch 136: loss=5.1606e-03, 
MAE_E_per_atom=21.1389 meV, MAE_F=46.2687 meV / A, MAE_stress_per_atom=0.1349 meV / A^3 +2023-12-10 07:41:41.951 INFO: Epoch 137: loss=5.1636e-03, MAE_E_per_atom=21.3884 meV, MAE_F=46.1299 meV / A, MAE_stress_per_atom=0.1468 meV / A^3 +2023-12-10 08:09:05.548 INFO: Process group initialized: True +2023-12-10 08:09:05.550 INFO: Processes: 80 +2023-12-10 08:09:05.550 INFO: MACE version: 0.3.0 +2023-12-10 08:09:05.551 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, 
lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-10 08:09:05.551 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-10 08:09:05.551 INFO: Using statistics json file +2023-12-10 08:09:05.551 INFO: Using atomic numbers from statistics file +2023-12-10 08:09:05.551 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-10 08:09:05.551 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-10 08:09:05.552 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, 
-3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-10 08:09:37.470 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-10 08:09:37.472 INFO: Average number of neighbors: 61.964672446250916 +2023-12-10 08:09:37.472 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-10 08:09:37.472 INFO: Building model +2023-12-10 08:09:37.473 INFO: Hidden irreps: 128x0e+128x1o +2023-12-10 08:09:40.948 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-10 08:09:40.951 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-137.pt +2023-12-10 08:09:41.181 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-10 08:09:41.187 INFO: Number of parameters: 4688656 +2023-12-10 08:09:41.187 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0010485760000000005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0010485760000000005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0010485760000000005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0010485760000000005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0010485760000000005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-10 08:09:41.187 INFO: Using Weights and Biases for logging +2023-12-10 08:09:54.879 INFO: Using gradient clipping with tolerance=100.000 +2023-12-10 08:09:54.879 INFO: Started training +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.730 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:10:02.731 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 08:28:10.963 INFO: Epoch 137: loss=5.1595e-03, MAE_E_per_atom=21.3598 meV, MAE_F=46.1232 meV / A, MAE_stress_per_atom=0.1454 meV / A^3 +2023-12-10 08:38:15.002 INFO: Epoch 138: loss=5.1020e-03, MAE_E_per_atom=21.0281 meV, MAE_F=45.9801 meV / A, MAE_stress_per_atom=0.1290 meV / A^3 +2023-12-10 08:48:05.155 INFO: Epoch 139: loss=5.1466e-03, MAE_E_per_atom=21.0615 meV, MAE_F=46.1580 meV / A, MAE_stress_per_atom=0.1437 meV / A^3 +2023-12-10 08:58:00.714 INFO: Epoch 140: loss=5.1404e-03, MAE_E_per_atom=21.1108 meV, MAE_F=46.1380 meV / A, MAE_stress_per_atom=0.1338 meV / A^3 +2023-12-10 09:07:57.223 INFO: Epoch 141: loss=5.1043e-03, MAE_E_per_atom=20.8992 meV, MAE_F=45.9616 meV / A, MAE_stress_per_atom=0.1318 meV / A^3 +2023-12-10 09:17:50.793 INFO: Epoch 142: loss=5.1329e-03, MAE_E_per_atom=21.1637 meV, MAE_F=45.9460 meV / A, MAE_stress_per_atom=0.1371 meV / A^3 +2023-12-10 09:27:48.228 INFO: Epoch 143: loss=5.1218e-03, MAE_E_per_atom=21.1212 meV, MAE_F=45.6969 meV / A, MAE_stress_per_atom=0.1424 meV / A^3 +2023-12-10 09:37:41.751 INFO: Epoch 144: loss=5.1243e-03, MAE_E_per_atom=21.1469 meV, MAE_F=45.8993 meV / A, MAE_stress_per_atom=0.1451 meV / A^3 +2023-12-10 09:47:36.914 INFO: Epoch 145: loss=5.0752e-03, 
MAE_E_per_atom=21.0259 meV, MAE_F=45.9411 meV / A, MAE_stress_per_atom=0.1293 meV / A^3 +2023-12-10 09:57:32.914 INFO: Epoch 146: loss=5.1075e-03, MAE_E_per_atom=21.0218 meV, MAE_F=45.5216 meV / A, MAE_stress_per_atom=0.1468 meV / A^3 +2023-12-10 10:04:52.126 INFO: Process group initialized: True +2023-12-10 10:04:52.129 INFO: Processes: 80 +2023-12-10 10:04:52.129 INFO: MACE version: 0.3.0 +2023-12-10 10:04:52.129 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, 
lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-10 10:04:52.129 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-10 10:04:52.129 INFO: Using statistics json file +2023-12-10 10:04:52.129 INFO: Using atomic numbers from statistics file +2023-12-10 10:04:52.130 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-10 10:04:52.130 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-10 10:04:52.130 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, 
-3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-10 10:05:24.174 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-10 10:05:24.176 INFO: Average number of neighbors: 61.964672446250916 +2023-12-10 10:05:24.176 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-10 10:05:24.176 INFO: Building model +2023-12-10 10:05:24.177 INFO: Hidden irreps: 128x0e+128x1o +2023-12-10 10:05:28.273 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-10 10:05:28.276 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-146.pt +2023-12-10 10:05:28.495 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-10 10:05:28.501 INFO: Number of parameters: 4688656 +2023-12-10 10:05:28.501 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0008388608000000005 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0008388608000000005 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0008388608000000005 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0008388608000000005 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0008388608000000005 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-10 10:05:28.501 INFO: Using Weights and Biases for logging +2023-12-10 10:05:41.024 INFO: Using gradient clipping with tolerance=100.000 +2023-12-10 10:05:41.024 INFO: Started training +2023-12-10 10:05:48.802 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.802 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 10:05:48.802 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.803 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.805 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:05:48.806 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 10:23:35.541 INFO: Epoch 146: loss=5.1094e-03, MAE_E_per_atom=21.0509 meV, MAE_F=45.5229 meV / A, MAE_stress_per_atom=0.1493 meV / A^3 +2023-12-10 10:33:43.510 INFO: Epoch 147: loss=5.1028e-03, MAE_E_per_atom=20.8921 meV, MAE_F=45.7332 meV / A, MAE_stress_per_atom=0.1333 meV / A^3 +2023-12-10 10:43:48.046 INFO: Epoch 148: loss=5.1176e-03, MAE_E_per_atom=20.9933 meV, MAE_F=45.8897 meV / A, MAE_stress_per_atom=0.1339 meV / A^3 +2023-12-10 10:53:47.955 INFO: Epoch 149: loss=5.1038e-03, MAE_E_per_atom=20.7985 meV, MAE_F=45.8389 meV / A, MAE_stress_per_atom=0.1314 meV / A^3 +2023-12-10 11:03:48.923 INFO: Epoch 150: loss=5.1227e-03, MAE_E_per_atom=21.0905 meV, MAE_F=45.8037 meV / A, MAE_stress_per_atom=0.1353 meV / A^3 +2023-12-10 11:13:52.743 INFO: Epoch 151: loss=5.1313e-03, MAE_E_per_atom=21.0618 meV, MAE_F=45.8204 meV / A, MAE_stress_per_atom=0.1498 meV / A^3 +2023-12-10 11:23:55.120 INFO: Epoch 152: loss=5.0875e-03, MAE_E_per_atom=20.8285 meV, MAE_F=45.7948 meV / A, MAE_stress_per_atom=0.1389 meV / A^3 +2023-12-10 11:33:55.902 INFO: Epoch 153: loss=5.1188e-03, MAE_E_per_atom=20.6853 meV, MAE_F=45.8290 meV / A, MAE_stress_per_atom=0.1410 meV / A^3 +2023-12-10 11:43:58.343 INFO: Epoch 154: loss=5.1175e-03, 
MAE_E_per_atom=20.8767 meV, MAE_F=45.6392 meV / A, MAE_stress_per_atom=0.1477 meV / A^3 +2023-12-10 11:53:57.417 INFO: Epoch 155: loss=5.0824e-03, MAE_E_per_atom=20.7770 meV, MAE_F=45.5690 meV / A, MAE_stress_per_atom=0.1452 meV / A^3 +2023-12-10 12:28:30.208 INFO: Process group initialized: True +2023-12-10 12:28:30.209 INFO: Processes: 80 +2023-12-10 12:28:30.210 INFO: MACE version: 0.3.0 +2023-12-10 12:28:30.210 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, 
lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-10 12:28:30.210 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-10 12:28:30.211 INFO: Using statistics json file +2023-12-10 12:28:30.211 INFO: Using atomic numbers from statistics file +2023-12-10 12:28:30.211 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-10 12:28:30.211 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-10 12:28:30.212 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, 
-3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-10 12:29:02.109 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-10 12:29:02.112 INFO: Average number of neighbors: 61.964672446250916 +2023-12-10 12:29:02.112 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-10 12:29:02.112 INFO: Building model +2023-12-10 12:29:02.113 INFO: Hidden irreps: 128x0e+128x1o +2023-12-10 12:29:05.524 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-10 12:29:05.527 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-155.pt +2023-12-10 12:29:05.749 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-10 12:29:05.755 INFO: Number of parameters: 4688656 +2023-12-10 12:29:05.755 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0006710886400000004 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0006710886400000004 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0006710886400000004 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0006710886400000004 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0006710886400000004 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-10 12:29:05.755 INFO: Using Weights and Biases for logging +2023-12-10 12:29:19.781 INFO: Using gradient clipping with tolerance=100.000 +2023-12-10 12:29:19.782 INFO: Started training +2023-12-10 12:29:27.471 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.471 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 12:29:27.471 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.471 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.472 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.472 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.472 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.472 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.474 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.474 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.475 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.476 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:29:27.477 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 12:47:18.074 INFO: Epoch 155: loss=5.0789e-03, MAE_E_per_atom=20.6669 meV, MAE_F=45.5489 meV / A, MAE_stress_per_atom=0.1472 meV / A^3 +2023-12-10 12:57:31.958 INFO: Epoch 156: loss=5.0970e-03, MAE_E_per_atom=21.0134 meV, MAE_F=45.6445 meV / A, MAE_stress_per_atom=0.1420 meV / A^3 +2023-12-10 13:07:33.314 INFO: Epoch 157: loss=5.0783e-03, MAE_E_per_atom=20.8552 meV, MAE_F=45.7069 meV / A, MAE_stress_per_atom=0.1378 meV / A^3 +2023-12-10 13:17:39.096 INFO: Epoch 158: loss=5.0716e-03, MAE_E_per_atom=20.7272 meV, MAE_F=45.5876 meV / A, MAE_stress_per_atom=0.1299 meV / A^3 +2023-12-10 13:27:40.386 INFO: Epoch 159: loss=5.0798e-03, MAE_E_per_atom=20.8668 meV, MAE_F=45.6102 meV / A, MAE_stress_per_atom=0.1410 meV / A^3 +2023-12-10 13:37:42.002 INFO: Epoch 160: loss=5.0743e-03, MAE_E_per_atom=20.7367 meV, MAE_F=45.6494 meV / A, MAE_stress_per_atom=0.1387 meV / A^3 +2023-12-10 13:47:45.895 INFO: Epoch 161: loss=5.0921e-03, MAE_E_per_atom=20.7296 meV, MAE_F=45.7178 meV / A, MAE_stress_per_atom=0.1400 meV / A^3 +2023-12-10 13:57:47.947 INFO: Epoch 162: loss=5.0688e-03, MAE_E_per_atom=20.7431 meV, MAE_F=45.5047 meV / A, MAE_stress_per_atom=0.1376 meV / A^3 +2023-12-10 14:07:48.241 INFO: Epoch 163: loss=5.0538e-03, 
MAE_E_per_atom=20.6881 meV, MAE_F=45.4652 meV / A, MAE_stress_per_atom=0.1360 meV / A^3 +2023-12-10 14:17:49.912 INFO: Epoch 164: loss=5.0658e-03, MAE_E_per_atom=20.6825 meV, MAE_F=45.3795 meV / A, MAE_stress_per_atom=0.1414 meV / A^3 +2023-12-10 14:39:26.095 INFO: Process group initialized: True +2023-12-10 14:39:26.097 INFO: Processes: 80 +2023-12-10 14:39:26.097 INFO: MACE version: 0.3.0 +2023-12-10 14:39:26.098 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, 
lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-10 14:39:26.098 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-10 14:39:26.098 INFO: Using statistics json file +2023-12-10 14:39:26.098 INFO: Using atomic numbers from statistics file +2023-12-10 14:39:26.098 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-10 14:39:26.098 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-10 14:39:26.099 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, 
-3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-10 14:39:58.825 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-10 14:39:58.827 INFO: Average number of neighbors: 61.964672446250916 +2023-12-10 14:39:58.827 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-10 14:39:58.827 INFO: Building model +2023-12-10 14:39:58.828 INFO: Hidden irreps: 128x0e+128x1o +2023-12-10 14:40:02.924 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-10 14:40:02.928 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-164.pt +2023-12-10 14:40:03.153 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-10 14:40:03.160 INFO: Number of parameters: 4688656 +2023-12-10 14:40:03.160 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0005368709120000003 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0005368709120000003 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0005368709120000003 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0005368709120000003 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0005368709120000003 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-10 14:40:03.160 INFO: Using Weights and Biases for logging +2023-12-10 14:40:16.391 INFO: Using gradient clipping with tolerance=100.000 +2023-12-10 14:40:16.392 INFO: Started training +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.248 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 14:40:24.248 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.248 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.248 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.249 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:40:24.250 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 14:58:46.434 INFO: Epoch 164: loss=5.0867e-03, MAE_E_per_atom=20.6885 meV, MAE_F=45.4179 meV / A, MAE_stress_per_atom=0.1442 meV / A^3 +2023-12-10 15:08:57.416 INFO: Epoch 165: loss=5.0778e-03, MAE_E_per_atom=20.8219 meV, MAE_F=45.4273 meV / A, MAE_stress_per_atom=0.1424 meV / A^3 +2023-12-10 15:19:00.190 INFO: Epoch 166: loss=5.0770e-03, MAE_E_per_atom=20.7576 meV, MAE_F=45.5791 meV / A, MAE_stress_per_atom=0.1426 meV / A^3 +2023-12-10 15:29:05.205 INFO: Epoch 167: loss=5.0840e-03, MAE_E_per_atom=20.6081 meV, MAE_F=45.5498 meV / A, MAE_stress_per_atom=0.1414 meV / A^3 +2023-12-10 15:39:06.531 INFO: Epoch 168: loss=5.0697e-03, MAE_E_per_atom=20.6109 meV, MAE_F=45.3615 meV / A, MAE_stress_per_atom=0.1488 meV / A^3 +2023-12-10 15:49:08.021 INFO: Epoch 169: loss=5.0682e-03, MAE_E_per_atom=20.6892 meV, MAE_F=45.3455 meV / A, MAE_stress_per_atom=0.1480 meV / A^3 +2023-12-10 15:59:09.977 INFO: Epoch 170: loss=5.0857e-03, MAE_E_per_atom=20.7561 meV, MAE_F=45.5546 meV / A, MAE_stress_per_atom=0.1491 meV / A^3 +2023-12-10 16:09:08.898 INFO: Epoch 171: loss=5.0902e-03, MAE_E_per_atom=20.5810 meV, MAE_F=45.5621 meV / A, MAE_stress_per_atom=0.1414 meV / A^3 +2023-12-10 16:19:11.443 INFO: Epoch 172: loss=5.0791e-03, 
MAE_E_per_atom=20.6978 meV, MAE_F=45.4846 meV / A, MAE_stress_per_atom=0.1413 meV / A^3 +2023-12-10 16:29:12.493 INFO: Epoch 173: loss=5.0883e-03, MAE_E_per_atom=20.7109 meV, MAE_F=45.6722 meV / A, MAE_stress_per_atom=0.1429 meV / A^3 +2023-12-10 19:13:10.486 INFO: Process group initialized: True +2023-12-10 19:13:10.487 INFO: Processes: 80 +2023-12-10 19:13:10.488 INFO: MACE version: 0.3.0 +2023-12-10 19:13:10.488 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, 
lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-10 19:13:10.488 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-10 19:13:10.489 INFO: Using statistics json file +2023-12-10 19:13:10.489 INFO: Using atomic numbers from statistics file +2023-12-10 19:13:10.489 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-10 19:13:10.489 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-10 19:13:10.490 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, 
-3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-10 19:13:42.382 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-10 19:13:42.385 INFO: Average number of neighbors: 61.964672446250916 +2023-12-10 19:13:42.385 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-10 19:13:42.385 INFO: Building model +2023-12-10 19:13:42.386 INFO: Hidden irreps: 128x0e+128x1o +2023-12-10 19:13:46.439 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-10 19:13:46.443 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-173.pt +2023-12-10 19:13:46.674 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-10 19:13:46.680 INFO: Number of parameters: 4688656 +2023-12-10 19:13:46.680 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0004294967296000003 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0004294967296000003 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0004294967296000003 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0004294967296000003 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.0004294967296000003 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-10 19:13:46.680 INFO: Using Weights and Biases for logging +2023-12-10 19:14:03.188 INFO: Using gradient clipping with tolerance=100.000 +2023-12-10 19:14:03.188 INFO: Started training +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.875 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.875 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.875 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.875 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 19:14:10.875 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.875 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.875 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.874 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.875 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.875 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.877 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:14:10.878 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 19:31:59.119 INFO: Epoch 173: loss=5.0844e-03, MAE_E_per_atom=20.6512 meV, MAE_F=45.6918 meV / A, MAE_stress_per_atom=0.1412 meV / A^3 +2023-12-10 19:42:08.143 INFO: Epoch 174: loss=5.0720e-03, MAE_E_per_atom=20.5848 meV, MAE_F=45.4678 meV / A, MAE_stress_per_atom=0.1419 meV / A^3 +2023-12-10 19:52:10.525 INFO: Epoch 175: loss=5.0704e-03, MAE_E_per_atom=20.6209 meV, MAE_F=45.3605 meV / A, MAE_stress_per_atom=0.1422 meV / A^3 +2023-12-10 20:02:10.377 INFO: Epoch 176: loss=5.0772e-03, MAE_E_per_atom=20.7264 meV, MAE_F=45.3575 meV / A, MAE_stress_per_atom=0.1468 meV / A^3 +2023-12-10 20:12:13.546 INFO: Epoch 177: loss=5.0612e-03, MAE_E_per_atom=20.7068 meV, MAE_F=45.4088 meV / A, MAE_stress_per_atom=0.1438 meV / A^3 +2023-12-10 20:22:15.764 INFO: Epoch 178: loss=5.0646e-03, MAE_E_per_atom=20.5105 meV, MAE_F=45.4172 meV / A, MAE_stress_per_atom=0.1435 meV / A^3 +2023-12-10 20:32:15.837 INFO: Epoch 179: loss=5.0674e-03, MAE_E_per_atom=20.6387 meV, MAE_F=45.5696 meV / A, MAE_stress_per_atom=0.1426 meV / A^3 +2023-12-10 20:42:17.588 INFO: Epoch 180: loss=5.0565e-03, MAE_E_per_atom=20.6626 meV, MAE_F=45.4246 meV / A, MAE_stress_per_atom=0.1414 meV / A^3 +2023-12-10 20:52:18.011 INFO: Epoch 181: loss=5.0501e-03, 
MAE_E_per_atom=20.5411 meV, MAE_F=45.4564 meV / A, MAE_stress_per_atom=0.1398 meV / A^3 +2023-12-10 21:02:20.261 INFO: Epoch 182: loss=5.0523e-03, MAE_E_per_atom=20.6437 meV, MAE_F=45.3312 meV / A, MAE_stress_per_atom=0.1427 meV / A^3 +2023-12-10 23:01:30.580 INFO: Process group initialized: True +2023-12-10 23:01:30.581 INFO: Processes: 80 +2023-12-10 23:01:30.581 INFO: MACE version: 0.3.0 +2023-12-10 23:01:30.582 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, 
lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-10 23:01:30.582 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-10 23:01:30.582 INFO: Using statistics json file +2023-12-10 23:01:30.582 INFO: Using atomic numbers from statistics file +2023-12-10 23:01:30.582 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-10 23:01:30.582 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-10 23:01:30.583 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, 
-3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-10 23:02:00.756 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-10 23:02:00.758 INFO: Average number of neighbors: 61.964672446250916 +2023-12-10 23:02:00.758 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-10 23:02:00.758 INFO: Building model +2023-12-10 23:02:00.759 INFO: Hidden irreps: 128x0e+128x1o +2023-12-10 23:02:04.192 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-10 23:02:04.196 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-182.pt +2023-12-10 23:02:04.433 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-10 23:02:04.440 INFO: Number of parameters: 4688656 +2023-12-10 23:02:04.440 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-10 23:02:04.440 INFO: Using Weights and Biases for logging +2023-12-10 23:02:21.901 INFO: Using gradient clipping with tolerance=100.000 +2023-12-10 23:02:21.901 INFO: Started training +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.581 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.581 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.581 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.581 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.582 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:02:29.583 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-10 23:20:39.718 INFO: Epoch 182: loss=5.0514e-03, MAE_E_per_atom=20.6305 meV, MAE_F=45.3551 meV / A, MAE_stress_per_atom=0.1429 meV / A^3 +2023-12-10 23:30:44.160 INFO: Epoch 183: loss=5.0622e-03, MAE_E_per_atom=20.5708 meV, MAE_F=45.4217 meV / A, MAE_stress_per_atom=0.1402 meV / A^3 +2023-12-10 23:40:47.378 INFO: Epoch 184: loss=5.0498e-03, MAE_E_per_atom=20.5559 meV, MAE_F=45.2165 meV / A, MAE_stress_per_atom=0.1429 meV / A^3 +2023-12-10 23:50:48.747 INFO: Epoch 185: loss=5.0545e-03, MAE_E_per_atom=20.5096 meV, MAE_F=45.3298 meV / A, MAE_stress_per_atom=0.1442 meV / A^3 +2023-12-11 00:00:47.739 INFO: Epoch 186: loss=5.0409e-03, MAE_E_per_atom=20.4267 meV, MAE_F=45.2885 meV / A, MAE_stress_per_atom=0.1446 meV / A^3 +2023-12-11 00:10:54.287 INFO: Epoch 187: loss=5.0671e-03, MAE_E_per_atom=20.5766 meV, MAE_F=45.4311 meV / A, MAE_stress_per_atom=0.1446 meV / A^3 +2023-12-11 00:20:56.375 INFO: Epoch 188: loss=5.0511e-03, MAE_E_per_atom=20.5455 meV, MAE_F=45.4358 meV / A, MAE_stress_per_atom=0.1405 meV / A^3 +2023-12-11 00:30:57.211 INFO: Epoch 189: loss=5.0396e-03, MAE_E_per_atom=20.4700 meV, MAE_F=45.3003 meV / A, MAE_stress_per_atom=0.1409 meV / A^3 +2023-12-11 00:40:58.282 INFO: Epoch 190: loss=5.0358e-03, 
MAE_E_per_atom=20.5185 meV, MAE_F=45.2424 meV / A, MAE_stress_per_atom=0.1417 meV / A^3 +2023-12-11 00:50:58.753 INFO: Epoch 191: loss=5.0568e-03, MAE_E_per_atom=20.4939 meV, MAE_F=45.4265 meV / A, MAE_stress_per_atom=0.1371 meV / A^3 +2023-12-11 05:17:40.331 INFO: Process group initialized: True +2023-12-11 05:17:40.333 INFO: Processes: 80 +2023-12-11 05:17:40.333 INFO: MACE version: 0.3.0 +2023-12-11 05:17:40.333 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, 
lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-11 05:17:40.333 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-11 05:17:40.334 INFO: Using statistics json file +2023-12-11 05:17:40.334 INFO: Using atomic numbers from statistics file +2023-12-11 05:17:40.334 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94) +2023-12-11 05:17:40.334 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-11 05:17:40.335 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, 
-3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-11 05:18:12.616 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-11 05:18:12.619 INFO: Average number of neighbors: 61.964672446250916 +2023-12-11 05:18:12.619 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-11 05:18:12.619 INFO: Building model +2023-12-11 05:18:12.620 INFO: Hidden irreps: 128x0e+128x1o +2023-12-11 05:18:16.815 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. 
+2023-12-11 05:18:16.818 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-191.pt +2023-12-11 05:18:17.042 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): 
TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): 
Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-11 05:18:17.047 INFO: Number of parameters: 4688656 +2023-12-11 05:18:17.048 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-11 05:18:17.048 INFO: Using Weights and Biases for logging +2023-12-11 05:18:33.110 INFO: Using gradient clipping with tolerance=100.000 +2023-12-11 05:18:33.110 INFO: Started training +2023-12-11 05:18:40.611 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.611 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-11 05:18:40.611 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.611 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.611 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.611 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.611 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.611 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.612 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:18:40.613 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 05:36:59.743 INFO: Epoch 191: loss=5.0583e-03, MAE_E_per_atom=20.5353 meV, MAE_F=45.4343 meV / A, MAE_stress_per_atom=0.1364 meV / A^3 +2023-12-11 05:47:06.420 INFO: Epoch 192: loss=5.0480e-03, MAE_E_per_atom=20.5238 meV, MAE_F=45.2994 meV / A, MAE_stress_per_atom=0.1433 meV / A^3 +2023-12-11 05:57:11.590 INFO: Epoch 193: loss=5.0366e-03, MAE_E_per_atom=20.4898 meV, MAE_F=45.1909 meV / A, MAE_stress_per_atom=0.1459 meV / A^3 +2023-12-11 06:07:12.461 INFO: Epoch 194: loss=5.0248e-03, MAE_E_per_atom=20.5298 meV, MAE_F=45.1347 meV / A, MAE_stress_per_atom=0.1403 meV / A^3 +2023-12-11 06:17:15.153 INFO: Epoch 195: loss=5.0304e-03, MAE_E_per_atom=20.3783 meV, MAE_F=45.1651 meV / A, MAE_stress_per_atom=0.1370 meV / A^3 +2023-12-11 06:27:16.630 INFO: Epoch 196: loss=5.0286e-03, MAE_E_per_atom=20.3210 meV, MAE_F=45.2389 meV / A, MAE_stress_per_atom=0.1347 meV / A^3 +2023-12-11 06:37:16.462 INFO: Epoch 197: loss=5.0332e-03, MAE_E_per_atom=20.4994 meV, MAE_F=45.1488 meV / A, MAE_stress_per_atom=0.1416 meV / A^3 +2023-12-11 06:47:17.958 INFO: Epoch 198: loss=5.0401e-03, MAE_E_per_atom=20.5324 meV, MAE_F=45.2630 meV / A, MAE_stress_per_atom=0.1396 meV / A^3 +2023-12-11 06:57:21.191 INFO: Epoch 199: loss=5.0398e-03, 
MAE_E_per_atom=20.4504 meV, MAE_F=45.3322 meV / A, MAE_stress_per_atom=0.1413 meV / A^3 +2023-12-11 06:57:21.430 INFO: Training complete +2023-12-11 06:57:21.430 INFO: Computing metrics for training, validation, and test sets +2023-12-11 06:57:21.480 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-199.pt +2023-12-11 06:57:21.810 INFO: Loaded model from epoch 199 +2023-12-11 06:57:21.811 INFO: Evaluating train ... +2023-12-11 07:00:36.777 INFO: Evaluating valid ... +2023-12-11 07:00:38.265 INFO: ++-------------+--------------------+-----------------+------------------+ +| config_type | MAE E / meV / atom | MAE F / meV / A | relative F MAE % | ++-------------+--------------------+-----------------+------------------+ +| train | 23.1 | 41.3 | 26.15 | +| valid | 20.7 | 45.2 | 32.17 | ++-------------+--------------------+-----------------+------------------+ +2023-12-11 07:00:38.265 INFO: Saving model to checkpoints/03-faster-02_run-1.model +2023-12-11 07:00:38.499 INFO: Done +2023-12-11 17:15:32.644 INFO: Process group initialized: True +2023-12-11 17:15:32.646 INFO: Processes: 80 +2023-12-11 17:15:32.646 INFO: MACE version: 0.3.0 +2023-12-11 17:15:32.646 INFO: Configuration: Namespace(name='03-faster-02', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=1, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, 
train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', wandb_entity='astagroup', wandb_name='03-faster-02', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight']) +2023-12-11 17:15:32.646 INFO: CUDA version: 11.8, CUDA device: 0 +2023-12-11 17:15:32.647 INFO: Using statistics json file +2023-12-11 17:15:32.647 INFO: Using atomic numbers from statistics file +2023-12-11 17:15:32.647 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 
94) +2023-12-11 17:15:32.647 INFO: Atomic Energies not in training file, using command line argument E0s +2023-12-11 17:15:32.648 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245] +2023-12-11 17:16:05.506 INFO: 
UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000) +2023-12-11 17:16:05.509 INFO: Average number of neighbors: 61.964672446250916 +2023-12-11 17:16:05.509 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False} +2023-12-11 17:16:05.509 INFO: Building model +2023-12-11 17:16:05.510 INFO: Hidden irreps: 128x0e+128x1o +2023-12-11 17:16:10.270 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint. +2023-12-11 17:16:10.273 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-199.pt +2023-12-11 17:16:10.495 INFO: ScaleShiftMACE( + (node_embedding): LinearNodeEmbeddingBlock( + (linear): Linear(89x0e -> 128x0e | 11392 weights) + ) + (radial_embedding): RadialEmbeddingBlock( + (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False) + (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0) + ) + (spherical_harmonics): SphericalHarmonics() + (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828]) + (interactions): ModuleList( + (0): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e -> 128x0e | 16384 
weights) + (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512] + (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e+128x1o | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + (1): RealAgnosticResidualInteractionBlock( + (linear_up): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + (conv_tp): TensorProduct(128x0e+128x1o x 1x0e+1x1o+1x2e+1x3o -> 256x0e+384x1o+384x2e+256x3o | 1280 paths | 1280 weights) + (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 1280] + (linear): Linear(256x0e+384x1o+384x2e+256x3o -> 128x0e+128x1o+128x2e+128x3o | 163840 weights) + (skip_tp): FullyConnectedTensorProduct(128x0e+128x1o x 89x0e -> 128x0e | 1458176 paths | 1458176 weights) + (reshape): reshape_irreps() + ) + ) + (products): ModuleList( + (0): EquivariantProductBasisBlock( + (symmetric_contractions): SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + (1): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x6x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e+128x1o -> 128x0e+128x1o | 32768 weights) + ) + (1): EquivariantProductBasisBlock( + (symmetric_contractions): 
SymmetricContraction( + (contractions): ModuleList( + (0): Contraction( + (contractions_weighting): ModuleList( + (0-1): 2 x GraphModule() + ) + (contractions_features): ModuleList( + (0-1): 2 x GraphModule() + ) + (weights): ParameterList( + (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)] + (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)] + ) + (graph_opt_main): GraphModule() + ) + ) + ) + (linear): Linear(128x0e -> 128x0e | 16384 weights) + ) + ) + (readouts): ModuleList( + (0): LinearReadoutBlock( + (linear): Linear(128x0e+128x1o -> 1x0e | 128 weights) + ) + (1): NonLinearReadoutBlock( + (linear_1): Linear(128x0e -> 16x0e | 2048 weights) + (non_linearity): Activation [x] (16x0e -> 16x0e) + (linear_2): Linear(16x0e -> 1x0e | 16 weights) + ) + ) + (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097) +) +2023-12-11 17:16:10.502 INFO: Number of parameters: 4688656 +2023-12-11 17:16:10.502 INFO: Optimizer: Adam ( +Parameter Group 0 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: embedding + weight_decay: 0.0 + +Parameter Group 1 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: interactions_decay + weight_decay: 1e-08 + +Parameter Group 2 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: interactions_no_decay + weight_decay: 0.0 + +Parameter Group 3 + amsgrad: True + betas: (0.9, 0.999) + capturable: False + differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: products + weight_decay: 1e-08 + +Parameter Group 4 + amsgrad: True + betas: (0.9, 0.999) + capturable: False 
+ differentiable: False + eps: 1e-08 + foreach: None + fused: None + lr: 0.00034359738368000027 + maximize: False + name: readouts + weight_decay: 0.0 +) +2023-12-11 17:16:10.502 INFO: Using Weights and Biases for logging +2023-12-11 17:16:31.996 INFO: Using gradient clipping with tolerance=100.000 +2023-12-11 17:16:31.996 INFO: Started training +2023-12-11 17:16:39.696 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.696 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.696 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.696 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.696 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.697 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.697 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.698 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.698 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.696 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.697 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.697 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.698 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.698 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.696 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.697 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.697 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.698 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.698 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.696 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.697 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.697 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.698 INFO: Reducer buckets have been rebuilt in this iteration. 
+2023-12-11 17:16:39.698 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.701 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:16:39.702 INFO: Reducer buckets have been rebuilt in this iteration. +2023-12-11 17:34:33.771 INFO: Epoch 199: loss=5.0447e-03, MAE_E_per_atom=20.4583 meV, MAE_F=45.3405 meV / A, MAE_stress_per_atom=0.1426 meV / A^3 +2023-12-11 17:34:34.082 INFO: Training complete +2023-12-11 17:34:34.082 INFO: Computing metrics for training, validation, and test sets +2023-12-11 17:34:34.088 INFO: Loading checkpoint: checkpoints/03-faster-02_run-1_epoch-199.pt +2023-12-11 17:34:34.459 INFO: Loaded model from epoch 199 +2023-12-11 17:34:34.460 INFO: Evaluating train ... +2023-12-11 17:39:34.503 INFO: Evaluating valid ... 
+2023-12-11 17:39:37.134 INFO:
++-------------+--------------------+-----------------+------------------+
+| config_type | MAE E / meV / atom | MAE F / meV / A | relative F MAE % |
++-------------+--------------------+-----------------+------------------+
+|    train    |        23.0        |       41.1      |      26.01       |
+|    valid    |        20.5        |       45.3      |      32.25       |
++-------------+--------------------+-----------------+------------------+
+2023-12-11 17:39:37.134 INFO: Saving model to checkpoints/03-faster-02_run-1.model
+2023-12-11 17:39:37.338 INFO: Done