1 Introduction
2 Related Work
2.1 Multi-image
2.2 Single Image
2.3 Deep Learning Approaches
3 Silhouette and Depth: A Multi-task Loss
3.1 Silhouette
3.2 Depth
4 Implementation
4.1 Loss Function
4.2 Improved Loss Functions
4.3 Architecture
4.4 3D Decoder
Dataset | Train | Val | Test | # of views |
---|---|---|---|---|
SketchFab | 372 | 20 | 33 | 5 |
SynthSculptures | 77 | – | – | 5 |
ShapeNet | 4744 | 678 | 1356 | 24 |
5 Dataset
5.1 Sculpture Datasets
5.2 ShapeNet
6 Experiments
6.1 Training Setup
6.1.1 Evaluation Measure
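The result tables in this section report two quantities, an \(L_1\) depth error and a silhouette IoU. As a minimal sketch of how such measures are conventionally computed, assuming silhouettes are thresholded at 0.5 and the depth error is averaged over every pixel (neither detail is confirmed by the text here, and the function names are hypothetical):

```python
from typing import Sequence

Grid = Sequence[Sequence[float]]


def silhouette_iou(pred: Grid, target: Grid, threshold: float = 0.5) -> float:
    """Intersection over union of two silhouettes after thresholding.

    The 0.5 threshold is an assumption, not a value taken from the paper.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_on = p > threshold
            t_on = t > threshold
            intersection += int(p_on and t_on)
            union += int(p_on or t_on)
    # Two empty silhouettes agree perfectly; this also avoids dividing by zero.
    return intersection / union if union else 1.0


def l1_depth_error(pred: Grid, target: Grid) -> float:
    """Mean absolute difference between two depth maps over all pixels.

    Averaging over every pixel (rather than only those inside the
    silhouette) is an assumption made for this sketch.
    """
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += abs(p - t)
            count += 1
    if count == 0:
        raise ValueError("depth maps must contain at least one pixel")
    return total / count
```

Under these assumptions a lower depth error and a higher IoU both indicate a better prediction, which matches how the two columns move in the tables.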
SketchFab augment? | SynthSculpture used? | SynthSculpture augment? | \(L_1\) Depth error | Silhouette IoU |
---|---|---|---|---|
✗ | ✗ | – | 0.210 | 0.643 |
✓ | ✗ | – | 0.202 | 0.719 |
✗ | ✓ | ✗ | 0.209 | 0.678 |
✓ | ✓ | ✓ | 0.201 | 0.724 |
Model | Input size | Output size | Pooling? | Improved loss? | Depth \(L_{1_{256\times 256}}\) error | Silhouette \(\hbox {IoU}_{256\times 256}\) |
---|---|---|---|---|---|---|
\(\hbox {SiDeNet}_{\mathrm{basic}}\) | \(256\times 256\) | \(256\times 256\) | Max | ✗ | 0.201 | 0.724 |
SiDeNet | \(256\times 256\) | \(256\times 256\) | Max | ✓ | 0.181 | 0.739 |
SiDeNet | \(256\times 256\) | \(256\times 256\) | Avg | ✓ | 0.189 | 0.734 |
\(\hbox {SiDeNet}_{57\times 57_{\mathrm{basic}}}\) | \(256\times 256\) | \(57\times 57\) | Max | ✗ | – | 0.723 |
\(\hbox {SiDeNet}_{57\times 57}\) | \(256\times 256\) | \(57\times 57\) | Max | ✓ | 0.195 | 0.734 |
SiDeNet3D | \(256\times 256\) | \(57\times 57\) | Max | ✓ | 0.182 | 0.733 |
Baseline: \(z=c\) | – | – | – | – | 0.223 | – |
6.1.2 Evaluation Setup
6.2 The Effect of the Data Augmentation
6.3 Ablation Study of the Different Architectures
6.4 The Effect of Using \(\mathcal {L}_{depth}\) and \(\mathcal {L}_{sil}\)
6.5 The Effect of Increasing the Number of Views
Loss function | \(\lambda _{depth}\) | \(\lambda _{sil}\) | Depth \(L_1\) error | Silhouette IoU |
---|---|---|---|---|
Silhouette and depth | 1 | 1 | 0.181 | 0.739 |
Silhouette | – | – | – | 0.734 |
Depth | – | – | 0.178 | – |
Pooling? | # Views (train) | # Views (test) | \(L_1\) Depth error | Silhouette IoU |
---|---|---|---|---|
Max | 1 | 1 | 0.206 | 0.702 |
Max | 1 | 2 | 0.210 | 0.712 |
Max | 1 | 3 | 0.209 | 0.716 |
Max | 2 | 1 | 0.204 | 0.694 |
Max | 2 | 2 | 0.181 | 0.739 |
Max | 2 | 3 | 0.170 | 0.751 |
Avg | 2 | 1 | 0.197 | 0.715 |
Avg | 2 | 2 | 0.192 | 0.725 |
Avg | 2 | 3 | 0.189 | 0.732 |
Max | 3 | 1 | 0.198 | 0.706 |
Max | 3 | 2 | 0.172 | 0.753 |
Max | 3 | 3 | 0.162 | 0.766 |
6.6 The Effect of Non-photometrically Consistent Inputs
6.7 Comparison on ShapeNet
Views have the same texture? | # Views (test) | \(L_1\) Depth error | Silhouette IoU |
---|---|---|---|
✓ | 1 | 0.165 | 0.739 |
✓ | 2 | 0.142 | 0.778 |
✓ | 3 | 0.139 | 0.785 |
✗ | 1 | 0.164 | 0.738 |
✗ | 2 | 0.143 | 0.777 |
✗ | 3 | 0.139 | 0.785 |
Model | Pre-training | Tested with 1 view | 2 views | 3 views | 4 views | 5 views |
---|---|---|---|---|---|---|
Yan et al. (2016) | ShapeNet | 0.797 | – | – | – | – |
SiDeNet | Sculptures | 0.831 | 0.845 | 0.850 | 0.852 | 0.853 |
SiDeNet | – | 0.826 | 0.843 | 0.848 | 0.850 | 0.851 |
\(\hbox {SiDeNet}_{256\times 256_{\mathrm{basic}}}\) | – | 0.814 | 0.831 | 0.835 | 0.837 | 0.837 |
\(\hbox {SiDeNet}_{57\times 57_{\mathrm{basic}}}\) | – | 0.775 | 0.791 | 0.795 | 0.796 | 0.795 |
6.8 The Effect of Varying \(\theta '\)
Model | Trained with | Evaluation is on | Tested with 1 view | 2 views | 3 views | 4 views | 6 views |
---|---|---|---|---|---|---|---|
SiDeNet | Silhouettes + depth | Depth | 1.47 | 0.72 | 0.62 | 0.59 | 0.58 |
Kar et al. (2017) | Depth | Depth | 1.73 | 0.82 | 0.71 | 0.67 | 0.65 |
Model | Trained with | Evaluation is on | Tested with 1 view | 2 views | 3 views |
---|---|---|---|---|---|
SiDeNet3D | Silhouettes + depth | 3D | 0.87 | 0.82 | 0.81 |
Kar et al. (2017) | Depth | Depth | 2.15 | 1.38 | 1.15 |
Tatarchenko et al. (2016) | Depth | Depth | 1.97 | – | – |
Yan et al. (2016) | Silhouettes | 3D | 1.26 | – | – |
Groueix et al. (2018) | 3D | 3D | 1.23 | – | – |