1 Introduction
- We propose ShadingNet, the first end-to-end model for learning the fine-grained shading decompositions (photometric effects) of natural scenes (Fig. 2).
- We design the model to couple shading decomposition with reflectance prediction, so that it learns reflectance cues separated from specific photometric effects and its disentanglement capability can be analyzed.
- We systematically analyze the quality and contributions of the fine-grained shading decompositions through quantitative and qualitative evaluations on seven datasets (NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD), achieving superior performance compared with state-of-the-art models that estimate a single, unified shading map.
2 Related Work
3 Fine-Grained Shading Decomposition
3.1 Standard Image Formation Model
3.2 Image Formation Model with Composite Shading
4 Dataset
4.1 Natural Environments Dataset (NED)
4.2 Fine-grained Shading Rendering Pipeline
5 Method
5.1 ShadingNet
5.1.1 Network Details
5.1.2 Training Details
5.2 Baselines
6 Experiments and Evaluation
6.1 Models
6.2 Datasets
Method | SMSE - NED (Albedo) | SMSE - NED (Shading) | LMSE - NED (Albedo) | LMSE - NED (Shading) | DSSIM - NED (Albedo) | DSSIM - NED (Shading) | SMSE - Sintel (Albedo) | SMSE - Sintel (Shading) | LMSE - Sintel (Albedo) | LMSE - Sintel (Shading) | DSSIM - Sintel (Albedo) | DSSIM - Sintel (Shading) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
STAR | 0.0174 | 0.0134 | 0.0512 | 0.0486 | 0.4927 | 0.2351 | 0.0242 | 0.0279 | 0.0588 | 0.0610 | 0.3020 | 0.2646 |
USI3D | 0.0081 | 0.0143 | 0.0360 | 0.0608 | 0.1886 | 0.2140 | 0.0212 | 0.0304 | 0.0507 | 0.0656 | 0.2688 | 0.2335 |
IIDWW | 0.0149 | 0.0175 | 0.0447 | 0.0698 | 0.2229 | 0.2346 | 0.0216 | 0.0273 | 0.0515 | 0.0678 | 0.2672 | 0.2612 |
InverseRenderNet | 0.0478 | 0.0505 | 0.0642 | 0.2597 | 0.2751 | 0.3382 | 0.0388 | 0.0446 | 0.0578 | 0.1132 | 0.3069 | 0.2797 |
DirectIntrinsics | 0.0089 | 0.0120 | 0.0412 | 0.0680 | 0.2116 | 0.2408 | 0.0257 | 0.0322 | 0.0645 | 0.0780 | 0.3255 | 0.2890 |
ShapeNet | 0.0075 | 0.0079 | 0.0276 | 0.0338 | 0.1216 | 0.1176 | 0.0243 | 0.0329 | 0.0562 | 0.0726 | 0.2258 | 0.2071 |
IntrinsicNet | 0.0114 | 0.0138 | 0.0333 | 0.0503 | 0.3707 | 0.4583 | 0.0248 | 0.0320 | 0.0546 | 0.0600 | 0.2077 | 0.2165 |
ParCNN | 0.0045 | 0.0052 | 0.0197 | 0.0272 | 0.1129 | 0.0952 | 0.0210 | 0.0271 | 0.0461 | 0.0723 | 0.2251 | 0.1902 |
Baseline-a | 0.0072 | 0.0082 | 0.0259 | 0.0387 | 0.1159 | 0.1266 | 0.0233 | 0.0366 | 0.0561 | 0.0708 | 0.2396 | 0.2316 |
Baseline-b | 0.0075 | 0.0084 | 0.0280 | 0.0385 | 0.1192 | 0.1340 | 0.0217 | 0.0323 | 0.0519 | 0.0666 | 0.2390 | 0.2214 |
ShadingNet (Ours) | 0.0027 | 0.0037 | 0.0122 | 0.0212 | 0.0798 | 0.0788 | 0.0199 | 0.0249 | 0.0448 | 0.0683 | 0.1991 | 0.1896 |
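
The tables report SMSE, LMSE and DSSIM without restating their definitions here. As a minimal sketch, assuming SMSE denotes the scale-invariant mean squared error commonly used in intrinsic image evaluation (the prediction is rescaled by the least-squares optimal scalar before the error is taken), the metric could be computed as follows. The function name `scale_invariant_mse` and the flat-list image layout are illustrative choices, not taken from the paper.

```python
def scale_invariant_mse(pred, target):
    """MSE after rescaling `pred` by the least-squares optimal scalar.

    Assumption: SMSE is read as scale-invariant MSE; `pred` and `target`
    are equal-length flat lists of pixel intensities (hypothetical layout).
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    # alpha minimizes sum((alpha * p - t)^2), i.e. alpha = <p, t> / <p, p>
    pp = sum(p * p for p in pred)
    alpha = sum(p * t for p, t in zip(pred, target)) / pp if pp > 0 else 0.0
    return sum((alpha * p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


target = [0.2, 0.4, 0.6, 0.8]
scaled = [0.5 * t for t in target]   # same image, globally darker
shifted = [t + 0.1 for t in target]  # additive offset, not a pure rescale

err_scaled = scale_invariant_mse(scaled, target)    # numerically zero
err_shifted = scale_invariant_mse(shifted, target)  # strictly positive
```

Under this reading, a prediction that differs from the ground truth only by a global brightness factor is not penalized, which suits albedo and shading maps that are recoverable only up to scale.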
6.3 Evaluations on NED, MPI Sintel and GTA V
Method | SMSE | LMSE | DSSIM |
---|---|---|---|
STAR | 0.0165 | 0.0767 | 0.3029 |
USI3D | 0.0129 | 0.0676 | 0.2642 |
IIDWW | 0.0146 | 0.0723 | 0.2713 |
InverseRenderNet | 0.0198 | 0.0884 | 0.2837 |
DirectIntrinsics | 0.0146 | 0.0800 | 0.2981 |
ShapeNet | 0.0138 | 0.0603 | 0.1771 |
IntrinsicNet | 0.0128 | 0.0603 | 0.1989 |
ParCNN | 0.0151 | 0.0656 | 0.4331 |
Baseline-a | 0.0145 | 0.0622 | 0.1883 |
Baseline-b | 0.0134 | 0.0612 | 0.1851 |
ShadingNet (Ours) | 0.0124 | 0.0590 | 0.1698 |
6.4 Evaluations on Intrinsic Images in the Wild (IIW)
Method | WHDR \(\downarrow\) |
---|---|
STAR | 36.21 |
USI3D | 36.69 |
IIDWW | 21.60 |
InverseRenderNet | 36.05 |
DirectIntrinsics | 41.64 |
ShapeNet | 40.33 |
IntrinsicNet | 38.17 |
ParCNN | 40.06 |
Baseline-a | 46.22 |
Baseline-b | 39.11 |
ShadingNet (Ours) | 35.73 |
ShadingNet (Ours)\(^{*}\) | 29.98 |
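
WHDR is reported here without a definition. The sketch below assumes the usual formulation from the IIW benchmark: each human judgment compares two points, states which one is darker (or that they are about equal) and carries a confidence weight; the score is the weight-normalized share of judgments that the predicted reflectance contradicts, expressed as a percentage. The threshold `delta=0.10`, the label encoding and the tuple layout are assumptions made for illustration.

```python
def whdr(reflectance, judgments, delta=0.10):
    """Weighted human disagreement rate, in percent (lower is better).

    Assumptions (illustrative, not taken from the paper):
      - `reflectance` maps a point id to a positive predicted reflectance;
      - each judgment is (point1, point2, label, weight) with label
        "1" (point1 darker), "2" (point2 darker) or "E" (about equal).
    """
    error = total = 0.0
    for p1, p2, label, weight in judgments:
        r1, r2 = reflectance[p1], reflectance[p2]
        if r2 / r1 > 1.0 + delta:
            predicted = "1"      # point1 is clearly darker
        elif r1 / r2 > 1.0 + delta:
            predicted = "2"      # point2 is clearly darker
        else:
            predicted = "E"      # ratio stays inside the tolerance band
        total += weight
        if predicted != label:
            error += weight
    return 100.0 * error / total if total > 0 else 0.0


reflectance = {"a": 0.20, "b": 0.50, "c": 0.52}
judgments = [
    ("a", "b", "1", 1.0),  # agrees: a is predicted darker than b
    ("b", "c", "E", 0.5),  # agrees: ratio 1.04 is inside the band
    ("a", "c", "2", 0.5),  # disagrees: prediction says a is darker
]
score = whdr(reflectance, judgments)
```

In this toy case one judgment of weight 0.5 out of a total weight of 2.0 is contradicted, so the score is 25.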
Method | SMSE (Albedo) | SMSE (Shading) | SMSE (Average) | LMSE (Albedo) | LMSE (Shading) | LMSE (Average) | DSSIM (Albedo) | DSSIM (Shading) | DSSIM (Average) |
---|---|---|---|---|---|---|---|---|---|
STAR | 0.0137 | 0.0114 | 0.0126 | 0.0614 | 0.0672 | 0.0643 | 0.1196 | 0.0825 | 0.1011 |
USI3D | 0.0156 | 0.0102 | 0.0129 | 0.0640 | 0.0474 | 0.0557 | 0.1158 | 0.1310 | 0.1234 |
IIDWW | 0.0126 | 0.0105 | 0.0116 | 0.0591 | 0.0457 | 0.0524 | 0.1049 | 0.1159 | 0.1104 |
InverseRenderNet | 0.0234 | 0.0137 | 0.0186 | 0.0573 | 0.0957 | 0.0765 | 0.1148 | 0.1276 | 0.1212 |
DirectIntrinsics | 0.0164 | 0.0093 | 0.0129 | 0.0683 | 0.0449 | 0.0566 | 0.1218 | 0.1159 | 0.1189 |
ShapeNet | 0.0207 | 0.0106 | 0.0157 | 0.0606 | 0.0595 | 0.0601 | 0.1027 | 0.0886 | 0.0957 |
IntrinsicNet | 0.0191 | 0.0089 | 0.0140 | 0.0618 | 0.0407 | 0.0513 | 0.0905 | 0.0989 | 0.0947 |
ParCNN | 0.0109 | 0.0086 | 0.0098 | 0.0462 | 0.0537 | 0.0500 | 0.0929 | 0.0999 | 0.0964 |
Baseline-a | 0.0141 | 0.0089 | 0.0115 | 0.0523 | 0.0548 | 0.0536 | 0.0929 | 0.0947 | 0.0938 |
Baseline-b | 0.0156 | 0.0086 | 0.0121 | 0.0563 | 0.0522 | 0.0543 | 0.0939 | 0.0953 | 0.0946 |
ShadingNet (Ours) | 0.0107 | 0.0071 | 0.0089 | 0.0390 | 0.0447 | 0.0419 | 0.0758 | 0.0865 | 0.0812 |
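
The Average columns appear to be the arithmetic mean of the corresponding Albedo and Shading entries, rounded to four decimals. The check below recomputes the SMSE average for three rows copied from the table above; the helper name `mean4` and the half-up rounding mode are assumptions made for this check, not details stated in the paper.

```python
from decimal import Decimal, ROUND_HALF_UP


def mean4(albedo, shading):
    """Mean of two scores given as decimal strings, rounded half-up to 4 places."""
    mean = (Decimal(albedo) + Decimal(shading)) / 2
    return mean.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


# (albedo SMSE, shading SMSE, reported average), copied from the table above
rows = {
    "STAR": ("0.0137", "0.0114", "0.0126"),
    "ParCNN": ("0.0109", "0.0086", "0.0098"),
    "ShadingNet (Ours)": ("0.0107", "0.0071", "0.0089"),
}

# True for every row whose reported average matches the recomputed one
matches = {name: mean4(a, s) == Decimal(avg) for name, (a, s, avg) in rows.items()}
```

Decimal arithmetic is used instead of floats so that ties such as 0.01255 round the same way as in the table.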
6.5 Evaluations on MIT Intrinsic Images
6.6 Evaluations on 3DRMS of Outdoor Garden Scenes
6.7 Evaluations on Shadow Removal Dataset (SRD)
6.8 Evaluation of the Refinement Module
Reflectance prediction | SMSE | LMSE | DSSIM |
---|---|---|---|
(a) Evaluations on NED (when there is no domain gap) | | | |
\(\rho _{a^{+}}\) | 0.0032 | 0.0160 | 0.1005 |
\(\rho _{a^{-}}\) | 0.0030 | 0.0144 | 0.0910 |
\(\rho _{u}\) | 0.0030 | 0.0157 | 0.0982 |
\(\rho \) | 0.0027 | 0.0122 | 0.0798 |
(b) Evaluations on GTA V (as cross-dataset generalization) | | | |
\(\rho _{a^{+}}\) | 0.0130 | 0.0620 | 0.2169 |
\(\rho _{a^{-}}\) | 0.0125 | 0.0598 | 0.1970 |
\(\rho _{u}\) | 0.0127 | 0.0613 | 0.2096 |
\(\rho \) | 0.0124 | 0.0590 | 0.1968 |
6.9 Evaluation of the Fine-Grained Shadings
Method | SMSE (\(e^{+}_a\)) | SMSE (\(e^{-}_a\)) | SMSE (\(e_d\)) |
---|---|---|---|
Baseline-a | 0.0155 | 0.0256 | 0.0545 |
Baseline-b | 0.0162 | 0.0293 | 0.0579 |
ShadingNet (Ours) | 0.0103 | 0.0209 | 0.0459 |