Skip to main content
Erschienen in: Journal of Scientific Computing 1/2024

01.04.2024

Haar-Like Wavelets on Hierarchical Trees

verfasst von: Rick Archibald, Ben Whitney

Erschienen in: Journal of Scientific Computing | Ausgabe 1/2024

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Discrete wavelet methods, originally formulated in the setting of regularly sampled signals, can be adapted to data defined on a point cloud if some multiresolution structure is imposed on the cloud. A wide variety of hierarchical clustering algorithms can be used for this purpose, and the multiresolution structure obtained can be encoded by a hierarchical tree of subsets of the cloud. Prior work introduced the use of Haar-like bases defined with respect to such trees for approximation and learning tasks on unstructured data. This paper builds on that work in two directions. First, we present an algorithm for constructing Haar-like bases on general discrete hierarchical trees. Second, with an eye towards data compression, we present thresholding techniques for data defined on a point cloud with error controlled in the \(L^{\infty }\) norm and in a Hölder-type norm. In a concluding trio of numerical examples, we apply our methods to compress a point cloud dataset, study the tightness of the \(L^{\infty }\) error bound, and use thresholding to identify MNIST classifiers with good generalizability.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
1
Theorem 1 differs slightly in the assumptions made of the hierarchical tree, but the proof of [36, Theorem 2] carries over with minimal modification.
 
2
The stipulation that the bound depend only on \({\underline{B}}\) and \({\overline{B}}\), and not on size of the tree, is essential, and in fact a bound in terms of \({\underline{B}}\) and the size of the tree always holds (take \(a_{i} = 1\) in Theorem 3).
 
3
We sample \([{20\,\mathrm{\%}}, {100\,\mathrm{\%}}]\) with lower resolution because we observe very little change in the behavior of the classifiers between the 20 % and 100 % (unthresholded) coefficient retention levels.
 
Literatur
1.
Zurück zum Zitat Robinson, A.H., Cherry, C.: Results of a prototype television bandwidth compression scheme. Proc. IEEE 55(3), 356–364 (1967)CrossRef Robinson, A.H., Cherry, C.: Results of a prototype television bandwidth compression scheme. Proc. IEEE 55(3), 356–364 (1967)CrossRef
2.
Zurück zum Zitat Bradley, Stevan D.: Optimizing a scheme for run length encoding. Proc. IEEE 57(1), 108–109 (1969)CrossRef Bradley, Stevan D.: Optimizing a scheme for run length encoding. Proc. IEEE 57(1), 108–109 (1969)CrossRef
3.
Zurück zum Zitat Hauck, Edward L.: Data compression using run length encoding and statistical encoding, December 2 (1986). US Patent 4,626,829 Hauck, Edward L.: Data compression using run length encoding and statistical encoding, December 2 (1986). US Patent 4,626,829
4.
Zurück zum Zitat Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory 23(3), 337–343 (1977)MathSciNetCrossRef Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory 23(3), 337–343 (1977)MathSciNetCrossRef
5.
Zurück zum Zitat Pavlov, Igor: LZMA specification (draft), (June 2015) Pavlov, Igor: LZMA specification (draft), (June 2015)
6.
Zurück zum Zitat Nevill-Manning, C.G., Witten, I.H.: Identifying hierarchical structure in sequences: a linear-time algorithm. J. Artif. Intell. Res. 7, 67–82 (1997)CrossRef Nevill-Manning, C.G., Witten, I.H.: Identifying hierarchical structure in sequences: a linear-time algorithm. J. Artif. Intell. Res. 7, 67–82 (1997)CrossRef
7.
Zurück zum Zitat Mandagere, N., Zhou, P., Smith, MA., Uttamchandani, S.: Demystifying data deduplication. In: Proceedings of the ACM/IFIP/USENIX Middleware ’08 Conference Companion, pp. 12–17, (2008) Mandagere, N., Zhou, P., Smith, MA., Uttamchandani, S.: Demystifying data deduplication. In: Proceedings of the ACM/IFIP/USENIX Middleware ’08 Conference Companion, pp. 12–17, (2008)
8.
Zurück zum Zitat Manber, U.: Finding similar files in a large file system. In: USENIX Winter 1994 Technical Conference Proceedings, vol. 94, pp. 1–10, (1994) Manber, U.: Finding similar files in a large file system. In: USENIX Winter 1994 Technical Conference Proceedings, vol. 94, pp. 1–10, (1994)
9.
Zurück zum Zitat Xia, Wen, J., Hong, F., Dan, H., Yu: S.: A similarity-locality based near-exact deduplication scheme with low ram overhead and high throughput. In: Proceedings of the 2011 USENIX Annual Technical Conference, pp. 26–30, (2011) Xia, Wen, J., Hong, F., Dan, H., Yu: S.: A similarity-locality based near-exact deduplication scheme with low ram overhead and high throughput. In: Proceedings of the 2011 USENIX Annual Technical Conference, pp. 26–30, (2011)
10.
Zurück zum Zitat Wallace, G.K.: The JPEG still picture compression standard. IEEE Trans. Consum. Electron. 38(1), 18–34 (1992)CrossRef Wallace, G.K.: The JPEG still picture compression standard. IEEE Trans. Consum. Electron. 38(1), 18–34 (1992)CrossRef
11.
Zurück zum Zitat Grgic, S., Kers, K., Grgic, M.: Image compression using wavelets. In: Proceedings of the IEEE International Symposium on Industrial Electronics. ISIE ‘99, vol. 1, pp. 99–104, (1999) Grgic, S., Kers, K., Grgic, M.: Image compression using wavelets. In: Proceedings of the IEEE International Symposium on Industrial Electronics. ISIE ‘99, vol. 1, pp. 99–104, (1999)
12.
Zurück zum Zitat Marcellin, M.W., Gormish, M.J., Bilgin, A., Boliek, M.P.: An overview of JPEG-2000. In: Proceedings DCC 2000. Data Compression Conference, pp. 523–541, (2000) Marcellin, M.W., Gormish, M.J., Bilgin, A., Boliek, M.P.: An overview of JPEG-2000. In: Proceedings DCC 2000. Data Compression Conference, pp. 523–541, (2000)
13.
Zurück zum Zitat Tang, Xiaoli, Pearlman, William A: Lossy-to-lossless block-based compression of hyperspectral volumetric data. In: 2004 International Conference on Image Processing. ICIP ’04., vol. 5, pp. 3283–3286. IEEE, (2004) Tang, Xiaoli, Pearlman, William A: Lossy-to-lossless block-based compression of hyperspectral volumetric data. In: 2004 International Conference on Image Processing. ICIP ’04., vol. 5, pp. 3283–3286. IEEE, (2004)
14.
Zurück zum Zitat Lindstrom, Peter: Fixed-rate compressed floating-point arrays. IEEE Trans. Visual Comput. Gr. 20(12), 2674–2683 (2014)CrossRef Lindstrom, Peter: Fixed-rate compressed floating-point arrays. IEEE Trans. Visual Comput. Gr. 20(12), 2674–2683 (2014)CrossRef
15.
Zurück zum Zitat Li, Shaomeng, Jaroszynski, Stanislaw, Pearse, Scott, Orf, Leigh, Clyne, John: VAPOR: a visualization package tailored to analyze simulation data in earth system science. Atmosphere 10(9), 488 (2019)CrossRef Li, Shaomeng, Jaroszynski, Stanislaw, Pearse, Scott, Orf, Leigh, Clyne, John: VAPOR: a visualization package tailored to analyze simulation data in earth system science. Atmosphere 10(9), 488 (2019)CrossRef
16.
Zurück zum Zitat Ainsworth, Mark, Tugluk, Ozan, Whitney, Ben, Klasky, Scott: Multilevel techniques for compression and reduction of scientific data–quantitative control of accuracy in derived quantities. SIAM J. Sci. Comput. 41(4), A2146–A2171 (2019)MathSciNetCrossRef Ainsworth, Mark, Tugluk, Ozan, Whitney, Ben, Klasky, Scott: Multilevel techniques for compression and reduction of scientific data–quantitative control of accuracy in derived quantities. SIAM J. Sci. Comput. 41(4), A2146–A2171 (2019)MathSciNetCrossRef
17.
Zurück zum Zitat Austin, W., Ballard, G., Kolda, T.G.: Parallel tensor compression for large-scale scientific data. In: 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 912–922, (2016) Austin, W., Ballard, G., Kolda, T.G.: Parallel tensor compression for large-scale scientific data. In: 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 912–922, (2016)
18.
Zurück zum Zitat Ballester-Ripoll, Rafael, Lindstrom, Peter, Pajarola, Renato: TTHRESH: tensor compression for multidimensional visual data. IEEE Trans. Visual Comput. Gr. 26(9), 2891–2903 (2020)CrossRef Ballester-Ripoll, Rafael, Lindstrom, Peter, Pajarola, Renato: TTHRESH: tensor compression for multidimensional visual data. IEEE Trans. Visual Comput. Gr. 26(9), 2891–2903 (2020)CrossRef
19.
Zurück zum Zitat Wu, Qing, Xia, Tian, Yu, Yizhou: Hierarchical tensor approximation of multidimensional images. In: 2007 IEEE International Conference on Image Processing, vol. 4, pp. 49–52. IEEE, (2007) Wu, Qing, Xia, Tian, Yu, Yizhou: Hierarchical tensor approximation of multidimensional images. In: 2007 IEEE International Conference on Image Processing, vol. 4, pp. 49–52. IEEE, (2007)
20.
Zurück zum Zitat Jiang, W.W., Kiang, S.Z., Hakim, N.Z., Meadows, H.E.: Lossless compression for medical imaging systems using linear/nonlinear prediction and arithmetic coding. In: ISCAS ‘93, IEEE International Symposium on Circuits and Systems, vol. 1, pp. 283–286, (1993) Jiang, W.W., Kiang, S.Z., Hakim, N.Z., Meadows, H.E.: Lossless compression for medical imaging systems using linear/nonlinear prediction and arithmetic coding. In: ISCAS ‘93, IEEE International Symposium on Circuits and Systems, vol. 1, pp. 283–286, (1993)
21.
Zurück zum Zitat Lindstrom, Peter, Isenburg, Martin: Fast and efficient compression of floating-point data. IEEE Trans. Visual Comput. Gr. 12(5), 1245–1250 (2006)CrossRef Lindstrom, Peter, Isenburg, Martin: Fast and efficient compression of floating-point data. IEEE Trans. Visual Comput. Gr. 12(5), 1245–1250 (2006)CrossRef
22.
Zurück zum Zitat Roelofs, Greg: PNG: The Definitive Guide. O’Reilly Media, Sebastopol (1999) Roelofs, Greg: PNG: The Definitive Guide. O’Reilly Media, Sebastopol (1999)
23.
Zurück zum Zitat Bautista Gomez, LA., Cappello, F: Improving floating point compression through binary masks. In: 2013 IEEE International Conference on Big Data, pp. 326–331, (2013) Bautista Gomez, LA., Cappello, F: Improving floating point compression through binary masks. In: 2013 IEEE International Conference on Big Data, pp. 326–331, (2013)
24.
Zurück zum Zitat Di, S., Cappello, F.: Fast error-bounded lossy HPC data compression with SZ. In: 2016 IEEE 30th International Parallel and Distributed Processing Symposium, Chicago, IL, USA, pp. 730–739 (2016). IEEE Di, S., Cappello, F.: Fast error-bounded lossy HPC data compression with SZ. In: 2016 IEEE 30th International Parallel and Distributed Processing Symposium, Chicago, IL, USA, pp. 730–739 (2016). IEEE
25.
Zurück zum Zitat Tao, D., Di, S., Chen, Z., Cappello, F.: Significantly improving lossy compression for scientific data sets based on multidimensional prediction and error-controlled quantization. In: 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 1129–1139, Orlando, FL, USA, (2017). IEEE Tao, D., Di, S., Chen, Z., Cappello, F.: Significantly improving lossy compression for scientific data sets based on multidimensional prediction and error-controlled quantization. In: 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 1129–1139, Orlando, FL, USA, (2017). IEEE
26.
Zurück zum Zitat Ainsworth, Mark, Tugluk, Ozan, Whitney, Ben, Klasky, Scott: Multilevel techniques for compression and reduction of scientific data–the unstructured case. SIAM J. Sci. Comput. 42(2), A1402–A1427 (2020)MathSciNetCrossRef Ainsworth, Mark, Tugluk, Ozan, Whitney, Ben, Klasky, Scott: Multilevel techniques for compression and reduction of scientific data–the unstructured case. SIAM J. Sci. Comput. 42(2), A1402–A1427 (2020)MathSciNetCrossRef
27.
Zurück zum Zitat Shuman, D.I., Narang, S.K., Frossard, P., Ortega, A., Vandergheynst, P.: The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Process. Mag. 30(3), 83–98 (2013)CrossRef Shuman, D.I., Narang, S.K., Frossard, P., Ortega, A., Vandergheynst, P.: The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Process. Mag. 30(3), 83–98 (2013)CrossRef
28.
Zurück zum Zitat Avena, Luca, Castell, Fabienne, Gaudillière, Alexandre, Mélot, Clothilde: Intertwining wavelets or multiresolution analysis on graphs through random forests. Appl. Comput. Harmon. Anal. 48(3), 949–992 (2020)MathSciNetCrossRef Avena, Luca, Castell, Fabienne, Gaudillière, Alexandre, Mélot, Clothilde: Intertwining wavelets or multiresolution analysis on graphs through random forests. Appl. Comput. Harmon. Anal. 48(3), 949–992 (2020)MathSciNetCrossRef
29.
30.
Zurück zum Zitat Hammond, David K., Vandergheynst, Pierre, Gribonval, Rémi.: Wavelets on graphs via spectral graph theory. Appl. Comput. Harmon. Anal. 30(2), 129–150 (2011)MathSciNetCrossRef Hammond, David K., Vandergheynst, Pierre, Gribonval, Rémi.: Wavelets on graphs via spectral graph theory. Appl. Comput. Harmon. Anal. 30(2), 129–150 (2011)MathSciNetCrossRef
32.
Zurück zum Zitat Lee, Ann B., Nadler, Boaz, Wasserman, Larry: Treelets–an adaptive multi-scale basis for sparse unordered data. Ann. Appl. Stat. 2(2), 435–471 (2008)MathSciNet Lee, Ann B., Nadler, Boaz, Wasserman, Larry: Treelets–an adaptive multi-scale basis for sparse unordered data. Ann. Appl. Stat. 2(2), 435–471 (2008)MathSciNet
33.
Zurück zum Zitat Elisha, Oren, Dekel, Shai: Wavelet decompositions of random forests: smoothness analysis, sparse approximation and applications. J. Mach. Learn. Res. 17(1), 6952–6989 (2016)MathSciNet Elisha, Oren, Dekel, Shai: Wavelet decompositions of random forests: smoothness analysis, sparse approximation and applications. J. Mach. Learn. Res. 17(1), 6952–6989 (2016)MathSciNet
34.
Zurück zum Zitat Salloum, Maher, Fabian, Nathan D., Hensinger, David M., Lee, Jina, Allendorf, Elizabeth M., Bhagatwala, Ankit, Blaylock, Myra L., Chen, Jacqueline H., Templeton, Jeremy A., Tezaur, Irina: Optimal compressed sensing and reconstruction of unstructured mesh datasets. Data Sci. Eng. 3(1), 1–23 (2018)CrossRef Salloum, Maher, Fabian, Nathan D., Hensinger, David M., Lee, Jina, Allendorf, Elizabeth M., Bhagatwala, Ankit, Blaylock, Myra L., Chen, Jacqueline H., Templeton, Jeremy A., Tezaur, Irina: Optimal compressed sensing and reconstruction of unstructured mesh datasets. Data Sci. Eng. 3(1), 1–23 (2018)CrossRef
35.
Zurück zum Zitat Bender, EA., Williamson, SG: Lists, decisions and graphs. S. Gill Williamson, (2010) Bender, EA., Williamson, SG: Lists, decisions and graphs. S. Gill Williamson, (2010)
36.
Zurück zum Zitat Gavish, Matan, Nadler, Boaz, Coifman, Ronald R: Multiscale wavelets on trees, graphs and high dimensional data: theory and applications to semi supervised learning. In: ICML, pp. 367–374, (2010) Gavish, Matan, Nadler, Boaz, Coifman, Ronald R: Multiscale wavelets on trees, graphs and high dimensional data: theory and applications to semi supervised learning. In: ICML, pp. 367–374, (2010)
37.
Zurück zum Zitat Shapiro, Jerome M.: Embedded image coding using Zerotrees of wavelet coefficients. IEEE Trans. Signal Process. 41(12), 3445–3462 (1993)CrossRef Shapiro, Jerome M.: Embedded image coding using Zerotrees of wavelet coefficients. IEEE Trans. Signal Process. 41(12), 3445–3462 (1993)CrossRef
38.
39.
Zurück zum Zitat Shilov, Georgi E., Silverman, Richard A., et al.: Elementary real and complex analysis. Courier Corporation, Chelmsford (1996) Shilov, Georgi E., Silverman, Richard A., et al.: Elementary real and complex analysis. Courier Corporation, Chelmsford (1996)
40.
Zurück zum Zitat Bentley, Jon Louis: Multidimensional binary search trees used for associative searching. Commun. ACM 18(9), 509–517 (1975)CrossRef Bentley, Jon Louis: Multidimensional binary search trees used for associative searching. Commun. ACM 18(9), 509–517 (1975)CrossRef
41.
Zurück zum Zitat LeCun, Yann, Bottou, Léon., Bengio, Yoshua, Haffner, Patrick: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)CrossRef LeCun, Yann, Bottou, Léon., Bengio, Yoshua, Haffner, Patrick: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)CrossRef
43.
Zurück zum Zitat Halko, Nathan, Martinsson, Per-Gunnar., Tropp, Joel A.: Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53(2), 217–288 (2011)MathSciNetCrossRef Halko, Nathan, Martinsson, Per-Gunnar., Tropp, Joel A.: Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53(2), 217–288 (2011)MathSciNetCrossRef
44.
Zurück zum Zitat Pedregosa, Fabian, Varoquaux, Gaël., Gramfort, Alexandre, Michel, Vincent, Thirion, Bertrand, Grisel, Olivier, Blondel, Mathieu, Prettenhofer, Peter, Weiss, Ron, Dubourg, Vincent, Vanderplas, Jake, Passos, Alexandre, Cournapeau, David, Brucher, Matthieu, Perrot, Matthieu, Duchesnay, Edouard: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)MathSciNet Pedregosa, Fabian, Varoquaux, Gaël., Gramfort, Alexandre, Michel, Vincent, Thirion, Bertrand, Grisel, Olivier, Blondel, Mathieu, Prettenhofer, Peter, Weiss, Ron, Dubourg, Vincent, Vanderplas, Jake, Passos, Alexandre, Cournapeau, David, Brucher, Matthieu, Perrot, Matthieu, Duchesnay, Edouard: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)MathSciNet
45.
Zurück zum Zitat Linderman, George C., Rachh, Manas, Hoskins, Jeremy G., Steinerberger, Stefan, Kluger, Yuval: Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data. Nat. Methods 16(3), 243–245 (2019)CrossRef Linderman, George C., Rachh, Manas, Hoskins, Jeremy G., Steinerberger, Stefan, Kluger, Yuval: Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data. Nat. Methods 16(3), 243–245 (2019)CrossRef
46.
Zurück zum Zitat Poličar, Pavlin G., Stražar, Martin, Zupan, Blaž: openTSNE: a modular Python library for t-SNE dimensionality reduction and embedding. bioRxiv, (2019) Poličar, Pavlin G., Stražar, Martin, Zupan, Blaž: openTSNE: a modular Python library for t-SNE dimensionality reduction and embedding. bioRxiv, (2019)
47.
Zurück zum Zitat Sattath, Shmuel, Tversky, Amos: Additive similarity trees. Psychometrika 42(3), 319–345 (1977)CrossRef Sattath, Shmuel, Tversky, Amos: Additive similarity trees. Psychometrika 42(3), 319–345 (1977)CrossRef
48.
Zurück zum Zitat Bertsekas, Dimitri: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999) Bertsekas, Dimitri: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)
Metadaten
Titel
Haar-Like Wavelets on Hierarchical Trees
verfasst von
Rick Archibald
Ben Whitney
Publikationsdatum
01.04.2024
Verlag
Springer US
Erschienen in
Journal of Scientific Computing / Ausgabe 1/2024
Print ISSN: 0885-7474
Elektronische ISSN: 1573-7691
DOI
https://doi.org/10.1007/s10915-024-02466-9

Weitere Artikel der Ausgabe 1/2024

Journal of Scientific Computing 1/2024 Zur Ausgabe

Premium Partner