1 Introduction
2 State-of-the-art and related work
3 Methodology
3.1 Encoder–decoder model architecture
3.2 \(\hbox {U}^p\)-Net: data fusion through multi-domain inputs
3.3 \(\hbox {U}^p\)-Net design considerations
3.4 Recursive time stepping
4 Results
4.1 2D wave equation
4.1.1 Data generation
4.1.2 Model setup and training methodology
Adam optimizer for 300 epochs with a learning rate decaying from \(10^{-4}\) to \(10^{-5}\), followed by another 50 epochs with a learning rate decaying from \(10^{-5}\) to \(10^{-7}\). We found this training strategy beneficial for the long-term prediction capabilities of the models; see Appendix C for details.
4.1.3 Single-step predictions
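As an illustration of the two-phase learning-rate schedule used for training in Section 4.1.2, the following minimal sketch computes a per-epoch learning rate. The text only states the start and end values of each phase, so the geometric (exponential) interpolation inside a phase, the function name `two_phase_lr`, and the `phases` tuple layout are assumptions made for this example, not the authors' implementation.

```python
def two_phase_lr(epoch, phases=((300, 1e-4, 1e-5), (50, 1e-5, 1e-7))):
    """Return the learning rate for a 0-based epoch index.

    Each phase is (num_epochs, lr_start, lr_end). Within a phase the rate is
    interpolated geometrically; this decay shape is an assumption, since the
    text only gives the endpoints of each phase.
    """
    start = 0
    for num_epochs, lr_start, lr_end in phases:
        if epoch < start + num_epochs:
            # Fraction of the phase completed: 0 at its first epoch, 1 at its last.
            frac = (epoch - start) / max(num_epochs - 1, 1)
            return lr_start * (lr_end / lr_start) ** frac
        start += num_epochs
    # Past the final phase: hold the last learning rate.
    return phases[-1][2]


# Learning rate for each of the 300 + 50 = 350 training epochs.
schedule = [two_phase_lr(e) for e in range(350)]
```

The resulting list can be handed to any optimizer that accepts a per-epoch learning rate. A linear or cosine decay between the same endpoints would be an equally plausible reading of the text.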
Model | Architecture | Loss (nMSE) | MSE | MAE | MEM |
---|---|---|---|---|---|
\(\mathcal {U}^{3}_{64}\) | \(\hbox {U}^p\)-Net | \(9.37 \times 10^{-8}\) | \(1.44 \times 10^{-9}\) | \(5.97 \times 10^{-6}\) | \(5.95 \times 10^{-7}\) |
\(\mathcal {U}^{3}_{29, \times 2}\) | \(\hbox {U}^p\)-Net | \(6.12 \times 10^{-8}\) | \(1.27 \times 10^{-9}\) | \(6.28 \times 10^{-6}\) | \(7.15 \times 10^{-7}\) |
\(\mathcal {U}_{96}\) | U-Net | \(7.15 \times 10^{-7}\) | \(8.54 \times 10^{-9}\) | \(9.28 \times 10^{-6}\) | \(1.32 \times 10^{-6}\) |
\(\mathcal {U}_{46, \times 2}\) | U-Net | \(4.85 \times 10^{-7}\) | \(7.69 \times 10^{-9}\) | \(1.12 \times 10^{-5}\) | \(1.72 \times 10^{-6}\) |
\(\mathcal {U}^{\textrm{ED}}_{106}\) | ED | \(3.16 \times 10^{-6}\) | \(3.91 \times 10^{-8}\) | \(3.09 \times 10^{-5}\) | \(4.45 \times 10^{-6}\) |
\(\mathcal {U}^{\textrm{ED}}_{48, \times 2}\) | ED | \(3.43 \times 10^{-6}\) | \(4.41 \times 10^{-8}\) | \(3.44 \times 10^{-5}\) | \(5.92 \times 10^{-6}\) |
4.1.4 Long-term predictions
Model | \(\hbox {MSE}_{100}\): mean ± stdev | \(\hbox {MSE}_{100}\): min | \(\hbox {MSE}_{100}\): max | \(\hbox {MAE}_{100}\): mean ± stdev | \(\hbox {MAE}_{100}\): min | \(\hbox {MAE}_{100}\): max |
---|---|---|---|---|---|---|
\(\mathcal {U}^{3}_{64}\) | \(9.38\pm 9.77\times 10^{-6}\) | \(1.02\times 10^{-6}\) | \(3.99\times 10^{-5}\) | \(2.08\pm 1.12\times 10^{-3}\) | \(7.34\times 10^{-4}\) | \(4.97\times 10^{-3}\) |
\(\mathcal {U}^{3}_{29, \times 2}\) | \(2.01\pm 1.93\times 10^{-5}\) | \(2.93\times 10^{-6}\) | \(7.07\times 10^{-5}\) | \(2.93\pm 1.34\times 10^{-3}\) | \(1.15\times 10^{-3}\) | \(6.22\times 10^{-3}\) |
\(\mathcal {U}_{96}\) | \(2.58\pm 2.04\times 10^{-5}\) | \(3.78\times 10^{-6}\) | \(8.11\times 10^{-5}\) | \(3.90\pm 1.48\times 10^{-3}\) | \(1.57\times 10^{-3}\) | \(6.91\times 10^{-3}\) |
\(\mathcal {U}_{46, \times 2}\) | \(1.82\pm 3.96\times 10^{-2}\) | \(6.40\times 10^{-6}\) | \(1.82\times 10^{-1}\) | \(3.06\pm 4.90\times 10^{-2}\) | \(1.52\times 10^{-3}\) | \(1.84\times 10^{-1}\) |
\(\mathcal {U}^{\textrm{ED}}_{106}\) | \(0.98\pm 1.45\times 10^{-3}\) | \(1.46\times 10^{-5}\) | \(7.47\times 10^{-3}\) | \(1.54\pm 0.91\times 10^{-2}\) | \(2.89\times 10^{-3}\) | \(3.69\times 10^{-2}\) |
\(\mathcal {U}^{\textrm{ED}}_{48, \times 2}\) | \(2.51\pm 0.13\times 10^{+3}\) | \(4.49\times 10^{-5}\) | \(7.02\times 10^{+4}\) | \(0.20\pm 1.0\times 10^{+1}\) | \(5.63\times 10^{-3}\) | \(5.42\times 10^{+1}\) |
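To make the long-term error statistics concrete, the sketch below shows one way such numbers could be produced: a one-step predictor is applied recursively (Section 3.4), the error against a reference is measured after 100 steps, and the results are summarized over several test trajectories. It assumes that \(\hbox {MSE}_{100}\) and \(\hbox {MAE}_{100}\) denote errors after 100 recursive steps. The functions `model_step` and `exact_step` are toy stand-ins, not the trained networks or the PDE solver, and all names are hypothetical.

```python
import statistics


def rollout(step, state, n_steps):
    """Apply a one-step predictor recursively, feeding each output back in."""
    states = []
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states


def mse(pred, ref):
    """Mean squared error between two equally long sequences."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred)


def mae(pred, ref):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(pred)


def summarize(values):
    """Mean, sample standard deviation, minimum and maximum of a list."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample stdev; an assumption
        "min": min(values),
        "max": max(values),
    }


# Toy stand-ins (assumptions): a reference decay and a slightly biased model.
def exact_step(u):
    return [0.99 * x for x in u]


def model_step(u):
    return [0.991 * x for x in u]


initial_states = [[1.0, 0.5, -0.5], [0.2, -0.1, 0.3], [0.0, 1.0, -1.0]]
mse_100, mae_100 = [], []
for u0 in initial_states:
    pred = rollout(model_step, u0, 100)[-1]  # model state after 100 steps
    ref = rollout(exact_step, u0, 100)[-1]   # reference state after 100 steps
    mse_100.append(mse(pred, ref))
    mae_100.append(mae(pred, ref))

mse_stats = summarize(mse_100)
mae_stats = summarize(mae_100)
```

Because each step consumes the previous prediction, small single-step errors can compound over the rollout, which is why single-step and long-term rankings of models need not agree.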
4.1.5 Out-of-distribution generalization properties
4.2 Heat equation
4.2.1 Data generation
4.2.2 Model setup
4.2.3 Predictions
Model | Architecture | \(f_{0/1/2}\) | Parameters | MSE | MAE | MEM |
---|---|---|---|---|---|---|
\(\mathcal {U}^{3}_{64}\) | \(\hbox {U}^p\)-Net | 64 | 991,617 | \(3.29 \times 10^{-7}\) | \(2.46 \times 10^{-4}\) | \(4.29 \times 10^{-5}\) |
\(\mathcal {U}_{96}\) | U-Net | 96 | 991,681 | \(3.82 \times 10^{-5}\) | \(4.41 \times 10^{-4}\) | \(2.18 \times 10^{-3}\) |
\(\mathcal {U}^{\textrm{ED}}_{106}\) | ED | 106 | 1,006,153 | \(1.45 \times 10^{-4}\) | \(9.89 \times 10^{-3}\) | \(9.14 \times 10^{-3}\) |