Skip to main content
Top
Published in: The Journal of Supercomputing 6/2015

01-06-2015

Algorithmic aspects of graph reduction for hardware/software partitioning

Authors: Guiyuan Jiang, Jigang Wu, Siew-Kei Lam, Thambipillai Srikanthan, Jizhou Sun

Published in: The Journal of Supercomputing | Issue 6/2015

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The hardware/software (HW/SW) partitioning is a major concern in heterogeneous multi-processor system-on-a-chip design, where the large design space prohibits rapid identification of optimal HW/SW solutions to meet tight time-to-market constraints. In this paper, we propose graph reduction techniques to reduce the design space for HW/SW partitioning without sacrificing the partition quality. There are two major phases in the proposed approach: reducible sub-graph searching and sub-graph evaluation and reduction. In the former phase, we design a dynamic programming-based algorithm, named path flow algorithm (PFA), to identify reducible sub-graph candidates for directed acyclic graph (DAG) as most previous works use DAG as task graph model. We also propose algorithm DeLoop to transform an arbitrary directed graph into a DAG such that all reducible sub-graphs on the original graph can be detected by performing algorithm PFA on the DAG. Our approach overcomes the limitation of the existing approach by enabling the identification of candidate sub-graphs in arbitrary task graphs. In latter phase, we propose a reduction model which enables accurate estimation of task execution time on hardware and design a method to select candidate sub-graphs for reduction. Experimental results demonstrate that the proposed methods not only reduce the design space, but also notably improve the partitioning quality since hardware-parallel execution of tasks is taken into account in the proposed sub-graph reduction model.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Arabnia HR, Oliver MA (1989) A transputer network for fast operations on digitised images. Comput Gr Forum 8(1):3–11 Arabnia HR, Oliver MA (1989) A transputer network for fast operations on digitised images. Comput Gr Forum 8(1):3–11
2.
go back to reference Arabnia HR (1990) A parallel algorithm for the arbitrary rotation of digitized images using process-and-data-decomposition approach. J Parallel Distrib Comput. 10(2):188–192CrossRef Arabnia HR (1990) A parallel algorithm for the arbitrary rotation of digitized images using process-and-data-decomposition approach. J Parallel Distrib Comput. 10(2):188–192CrossRef
3.
go back to reference Bhandarkar SM, Arabnia HR (1995) The REFINE multiprocessor theoretical properties and algorithms. Parallel Comput 21(11):1783–1805CrossRef Bhandarkar SM, Arabnia HR (1995) The REFINE multiprocessor theoretical properties and algorithms. Parallel Comput 21(11):1783–1805CrossRef
4.
go back to reference Arabnia HR, Smith JW (1993) A reconfigurable interconnection network for imaging operations and its implementation using a multi-stage switching box. In: Proceedings of the 7th annual international high performance computing conference, pp 349–357 Arabnia HR, Smith JW (1993) A reconfigurable interconnection network for imaging operations and its implementation using a multi-stage switching box. In: Proceedings of the 7th annual international high performance computing conference, pp 349–357
5.
go back to reference Arató P, Mann ZÁ, Orbán A (2005) Algorithmic aspects of hardware/software partitioning. ACM Trans Des Autom Electron Syst (TODAES) 10(1):136–156 Arató P, Mann ZÁ, Orbán A (2005) Algorithmic aspects of hardware/software partitioning. ACM Trans Des Autom Electron Syst (TODAES) 10(1):136–156
6.
go back to reference Tahaee SA, Jahangir AH (2010) A polynomial algorithm for partitioning problems. ACM Trans Embed Comput Syst (TECS) 9(4):1–38. (Article 34) Tahaee SA, Jahangir AH (2010) A polynomial algorithm for partitioning problems. ACM Trans Embed Comput Syst (TECS) 9(4):1–38. (Article 34)
7.
go back to reference Vahid F, Gajski DD, Gong J (1994) A binary-constraint search algorithm for minimizing hardware during hardware/software partitioning. In: Proceedings of the conference on the European design automation (EURO-DAC), pp 214–219 Vahid F, Gajski DD, Gong J (1994) A binary-constraint search algorithm for minimizing hardware during hardware/software partitioning. In: Proceedings of the conference on the European design automation (EURO-DAC), pp 214–219
8.
go back to reference Dick RP, Jha NK (1997) Mogac: a multiobjective genetic algorithm for the co-synthesis of hardware–software embedded systems, In: Proceedings of 1997 IEEE/ACM international conference on computer-aided design (ICCAD), pp 522–529 Dick RP, Jha NK (1997) Mogac: a multiobjective genetic algorithm for the co-synthesis of hardware–software embedded systems, In: Proceedings of 1997 IEEE/ACM international conference on computer-aided design (ICCAD), pp 522–529
9.
go back to reference López-Vallejo M, López JC (2003) On the hardware–software partitioning problem: system modeling and partitioning techniques. ACM Trans Des Autom Electron Syst (TODAES) 8(3):269–297CrossRef López-Vallejo M, López JC (2003) On the hardware–software partitioning problem: system modeling and partitioning techniques. ACM Trans Des Autom Electron Syst (TODAES) 8(3):269–297CrossRef
10.
go back to reference Vahid F (2002) Partitioning sequential programs for cad using a three-step approach. ACM Trans Des Autom Electron Syst (TODAES) 7(3):413–429CrossRef Vahid F (2002) Partitioning sequential programs for cad using a three-step approach. ACM Trans Des Autom Electron Syst (TODAES) 7(3):413–429CrossRef
11.
go back to reference Wu J, Srikanthan T, Tao J (2008) Algorithmic aspects for functional partitioning and scheduling in hardware/software co-design. Des Autom Embed Syst 12(4):345–375CrossRef Wu J, Srikanthan T, Tao J (2008) Algorithmic aspects for functional partitioning and scheduling in hardware/software co-design. Des Autom Embed Syst 12(4):345–375CrossRef
12.
go back to reference Henkel J (1999) A low power hardware/software partitioning approach for core-based embedded systems. In: Proceedings of the 36th ACM/IEEE design automation conference (DAC), pp 122–127 Henkel J (1999) A low power hardware/software partitioning approach for core-based embedded systems. In: Proceedings of the 36th ACM/IEEE design automation conference (DAC), pp 122–127
13.
go back to reference Stitt G, Vahid F (2002) The energy advantages of microprocessor platforms with on-chip configurable logic. IEEE Des Test Comput 19(6):36–43CrossRef Stitt G, Vahid F (2002) The energy advantages of microprocessor platforms with on-chip configurable logic. IEEE Des Test Comput 19(6):36–43CrossRef
14.
go back to reference Mu J, Lysecky R (2009) Autonomous hardware/software partitioning and voltage/frequency scaling for low-power embedded systems. ACM Trans Des Autom Electron Syst (TODAES) 15(1):1–20CrossRef Mu J, Lysecky R (2009) Autonomous hardware/software partitioning and voltage/frequency scaling for low-power embedded systems. ACM Trans Des Autom Electron Syst (TODAES) 15(1):1–20CrossRef
15.
go back to reference Lysecky R (2007) Low-power warp processor for power efficient high-performance embedded systems. In: Proceedings of the design, automation and test in Europe conference and exhibition (DATE), pp 1–6 Lysecky R (2007) Low-power warp processor for power efficient high-performance embedded systems. In: Proceedings of the design, automation and test in Europe conference and exhibition (DATE), pp 1–6
16.
go back to reference Banerjee S, Dutt N (2004) Efficient search space exploration for HW–SW partitioning. In: Proceedings of the IEEE/ACM/IFIP international conference on hardware/software codesign and system synthesis (CODES+ISSS), pp 122–127 Banerjee S, Dutt N (2004) Efficient search space exploration for HW–SW partitioning. In: Proceedings of the IEEE/ACM/IFIP international conference on hardware/software codesign and system synthesis (CODES+ISSS), pp 122–127
17.
go back to reference Wu J, Srikanthan T, Chen G (2010) Algorithmic aspects of hardware/software partitioning: 1D search algorithms. IEEE Trans Comput 59(4):532–544CrossRefMathSciNet Wu J, Srikanthan T, Chen G (2010) Algorithmic aspects of hardware/software partitioning: 1D search algorithms. IEEE Trans Comput 59(4):532–544CrossRefMathSciNet
18.
go back to reference Tahaee SA, Jahangir AH, Habibi-Masouleh H (2008) Improving the performance of heuristic searches with judicious initial point selection. In: Proceedings of the 5th IEEE international symposium on embedded computing, pp 14–19 Tahaee SA, Jahangir AH, Habibi-Masouleh H (2008) Improving the performance of heuristic searches with judicious initial point selection. In: Proceedings of the 5th IEEE international symposium on embedded computing, pp 14–19
19.
go back to reference Wu J, Srikanthan T (2006) Low-complex dynamic programming algorithm for hardware/software partitioning. Inf Process Lett 98(2):41–46CrossRefMATH Wu J, Srikanthan T (2006) Low-complex dynamic programming algorithm for hardware/software partitioning. Inf Process Lett 98(2):41–46CrossRefMATH
20.
go back to reference Niemann R, Marwedel P (1997) An algorithm for hardware/software partitioning using mixed integer linear programming. Des Autom Embed Syst 2(2):165–193CrossRef Niemann R, Marwedel P (1997) An algorithm for hardware/software partitioning using mixed integer linear programming. Des Autom Embed Syst 2(2):165–193CrossRef
21.
go back to reference Wu J, Wang P, Lam SK, Srikanthan T (2013) Efficient heuristic and tabu search for hardware/software partitioning. J Supercomput 66(1):118–134CrossRef Wu J, Wang P, Lam SK, Srikanthan T (2013) Efficient heuristic and tabu search for hardware/software partitioning. J Supercomput 66(1):118–134CrossRef
22.
go back to reference Monshizadeh N, Trentelman HL, Camlibel MK (2014) Projection based model reduction of multi-agent systems using graph partitions. IEEE Trans Control Netw Syst 1(2):145–154CrossRefMathSciNet Monshizadeh N, Trentelman HL, Camlibel MK (2014) Projection based model reduction of multi-agent systems using graph partitions. IEEE Trans Control Netw Syst 1(2):145–154CrossRefMathSciNet
23.
go back to reference Shen C, Tong W (2013) A novel method of adopting graph reduction for resource management in parallel computing model. In: Proceedings of the 2013 IEEE international conference on green computing and communications and ieee internet of things and IEEE cyber, physical and social computing, pp 2218–2220 Shen C, Tong W (2013) A novel method of adopting graph reduction for resource management in parallel computing model. In: Proceedings of the 2013 IEEE international conference on green computing and communications and ieee internet of things and IEEE cyber, physical and social computing, pp 2218–2220
24.
go back to reference Song Y, Shi G (2012) Hierarchical graph reduction approach to symbolic circuit analysis with data sharing and cancellation-free properties. In: Proceedings of the 2012 17th Asia and South Pacific design automation conference (ASP-DAC), pp 541–546 Song Y, Shi G (2012) Hierarchical graph reduction approach to symbolic circuit analysis with data sharing and cancellation-free properties. In: Proceedings of the 2012 17th Asia and South Pacific design automation conference (ASP-DAC), pp 541–546
25.
go back to reference Xu Y, Chu C (2009) GREMA: graph reduction based efficient mask assignment for double patterning technology. In: Proceedings of the IEEE/ACM international conference on computer-aided design (ICCAD), pp 601–606 Xu Y, Chu C (2009) GREMA: graph reduction based efficient mask assignment for double patterning technology. In: Proceedings of the IEEE/ACM international conference on computer-aided design (ICCAD), pp 601–606
26.
go back to reference Henke J, Ernst R (1995) A path-based technique for estimating hardware runtime in Hw/Sw-cosynthesis. In: Proceedings of the 8th international symposium on system synthesis, pp 116–121 Henke J, Ernst R (1995) A path-based technique for estimating hardware runtime in Hw/Sw-cosynthesis. In: Proceedings of the 8th international symposium on system synthesis, pp 116–121
27.
go back to reference Maciel P, Barros E, Rosenstiel W (1997) Computing communication cost by Petri nets for hardware/software codesign. In: Proceedings of the 8th international workshop on rapid system prototyping (RSP’97) shortening the path from specification to prototype, pp 44–56 Maciel P, Barros E, Rosenstiel W (1997) Computing communication cost by Petri nets for hardware/software codesign. In: Proceedings of the 8th international workshop on rapid system prototyping (RSP’97) shortening the path from specification to prototype, pp 44–56
28.
go back to reference Henkel J, Ernst R (1998) High-level estimation techniques for usage in hardware/software co-design. In: Proceedings of the Asia and South Pacific design automation conference (ASP-DAC), pp 353–360 Henkel J, Ernst R (1998) High-level estimation techniques for usage in hardware/software co-design. In: Proceedings of the Asia and South Pacific design automation conference (ASP-DAC), pp 353–360
29.
go back to reference Li Y, Henkel J (1998) A framework for estimating and minimizing energy dissipation of embedded HW/SW systems. In: Proceedings of the 35th design automation conference (DAC), pp 188–193 Li Y, Henkel J (1998) A framework for estimating and minimizing energy dissipation of embedded HW/SW systems. In: Proceedings of the 35th design automation conference (DAC), pp 188–193
30.
go back to reference Junior M, Neto S, Maciel P, Lima R, Ribeiro A, Barreto R, Tavares E, Braga F (2006) Analyzing software performance and energy consumption of embedded systems by probabilistic modeling: an approach based on coloured Petri nets. In: Proceedings of the 27th international conference on applications and theory of Petri nets and other models of concurrency (ICATPN), Turku, pp.261–281 (pp 26–30) Junior M, Neto S, Maciel P, Lima R, Ribeiro A, Barreto R, Tavares E, Braga F (2006) Analyzing software performance and energy consumption of embedded systems by probabilistic modeling: an approach based on coloured Petri nets. In: Proceedings of the 27th international conference on applications and theory of Petri nets and other models of concurrency (ICATPN), Turku, pp.261–281 (pp 26–30)
31.
go back to reference Ye W, Ernst R, Benner T, Henkel J (1993) Fast timing analysis for hardware–software co-synthesis. In: Proceedings of the IEEE international conference on computer design: VLSI in computers and processors (ICCD), pp 452–457 Ye W, Ernst R, Benner T, Henkel J (1993) Fast timing analysis for hardware–software co-synthesis. In: Proceedings of the IEEE international conference on computer design: VLSI in computers and processors (ICCD), pp 452–457
32.
go back to reference Vahid F, Gajski DD (1995) Closeness metrics for system-level functional partitioning. In: Proceedings of the European design automation conference with EURO-VHDL (EURO-DAC), pp 328–333 Vahid F, Gajski DD (1995) Closeness metrics for system-level functional partitioning. In: Proceedings of the European design automation conference with EURO-VHDL (EURO-DAC), pp 328–333
33.
go back to reference Lam SK, Srikanthan T (2009) Rapid design of area-efficient custom instructions for reconfigurable embedded processing. J Syst Archit 55(1):1–14CrossRef Lam SK, Srikanthan T (2009) Rapid design of area-efficient custom instructions for reconfigurable embedded processing. J Syst Archit 55(1):1–14CrossRef
34.
go back to reference Lam SK, Srikanthan T, Clarke CT (2011) Architecture-aware technique for mapping area-time efficient custom instructions onto FPGAs. IEEE Trans Comput 60(5):680–692CrossRefMathSciNet Lam SK, Srikanthan T, Clarke CT (2011) Architecture-aware technique for mapping area-time efficient custom instructions onto FPGAs. IEEE Trans Comput 60(5):680–692CrossRefMathSciNet
35.
go back to reference Chuong LM, Lam SK, Srikanthan T (2009) Area-time estimation of controller for porting C-based functions onto FPGA. In: Proceedings of the IEEE/IFIP international symposium on rapid system prototyping (RSP), pp 145–151 Chuong LM, Lam SK, Srikanthan T (2009) Area-time estimation of controller for porting C-based functions onto FPGA. In: Proceedings of the IEEE/IFIP international symposium on rapid system prototyping (RSP), pp 145–151
36.
go back to reference Zhao K, Bian J (2011) Instruction-level hardware/software partition through DFG exploration. In: Proceedings of the 15th international conference on computer supported cooperative work in design (CSCWD), pp 55–60 Zhao K, Bian J (2011) Instruction-level hardware/software partition through DFG exploration. In: Proceedings of the 15th international conference on computer supported cooperative work in design (CSCWD), pp 55–60
37.
go back to reference Wang D, Li S, Dou Y (2008) Collaborative hardware/software partition of coarse-grained reconfigurable system using evolutionary ant colony optimization. In: Proceedings of the 2008 Asia and South Pacific design automation conference (ASP-DAC), pp 679–684 Wang D, Li S, Dou Y (2008) Collaborative hardware/software partition of coarse-grained reconfigurable system using evolutionary ant colony optimization. In: Proceedings of the 2008 Asia and South Pacific design automation conference (ASP-DAC), pp 679–684
38.
go back to reference Bouchhima A, Gerin P, Petrot F (2009) Automatic instrumentation of embedded software for high level hardware/software co-simulation. In: Proceedings of the 2009 Asia and South Pacific design automation conference (ASP-DAC), pp 546–551 Bouchhima A, Gerin P, Petrot F (2009) Automatic instrumentation of embedded software for high level hardware/software co-simulation. In: Proceedings of the 2009 Asia and South Pacific design automation conference (ASP-DAC), pp 546–551
39.
go back to reference Wolf W (1996) Object-oriented cosynthesis of distributed embedded systems. ACM Trans Des Autom Electron Syst (TODAES) 1(3):301–314CrossRef Wolf W (1996) Object-oriented cosynthesis of distributed embedded systems. ACM Trans Des Autom Electron Syst (TODAES) 1(3):301–314CrossRef
40.
go back to reference Henkel J, Ernst R (2001) An approach to automated hardware/software partitioning using a flexible granularity that is driven by high-level estimation techniques. IEEE Trans Very Large Scale Integr (VLSI) Syst 9(2):273–289 Henkel J, Ernst R (2001) An approach to automated hardware/software partitioning using a flexible granularity that is driven by high-level estimation techniques. IEEE Trans Very Large Scale Integr (VLSI) Syst 9(2):273–289
41.
go back to reference Gibson D, Kumar R, McCurley KS, Tomkins A (2006) Dense subgraph extraction. In: Cook DJ, Holder LB (eds) Mining graph data. Wiley, Hoboken Gibson D, Kumar R, McCurley KS, Tomkins A (2006) Dense subgraph extraction. In: Cook DJ, Holder LB (eds) Mining graph data. Wiley, Hoboken
42.
go back to reference Cuzzocrea A, Song IY (2014) Big graph analytics: the state of the art and future research agenda. In: Proceedings of 17th international workshop on data warehousing and OLAP. ACM, pp 99–101 Cuzzocrea A, Song IY (2014) Big graph analytics: the state of the art and future research agenda. In: Proceedings of 17th international workshop on data warehousing and OLAP. ACM, pp 99–101
43.
go back to reference Quamar A, Deshpande A, Lin J (2014) NScale: neighborhood-centric analytics on large graphs. Proc VLDB 7(13):1673–1676CrossRef Quamar A, Deshpande A, Lin J (2014) NScale: neighborhood-centric analytics on large graphs. Proc VLDB 7(13):1673–1676CrossRef
44.
go back to reference Huan J, Wang W, Prins J, Yang J (2004) Spin: mining maximal frequent subgraphs from graph databases. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 581–586 Huan J, Wang W, Prins J, Yang J (2004) Spin: mining maximal frequent subgraphs from graph databases. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 581–586
45.
go back to reference Youness H, Hassan M, Sakanushi K, Takeuchi Y, Imai M, Salem A, Wahdan AM, Moness M (2009) A high performance algorithm for scheduling and hardware–software partitioning on MPSoCs. In: Proceedings of the 4th international conference on design and technology of integrated systems in nanoscale era (DTIS’09), pp 71–76 Youness H, Hassan M, Sakanushi K, Takeuchi Y, Imai M, Salem A, Wahdan AM, Moness M (2009) A high performance algorithm for scheduling and hardware–software partitioning on MPSoCs. In: Proceedings of the 4th international conference on design and technology of integrated systems in nanoscale era (DTIS’09), pp 71–76
46.
go back to reference Han H, Liu W, Wu J, Jiang G (2013) Efficient algorithm for hardware/software partitioning and scheduling on MPSoC. J Comput 8(1):61–68CrossRef Han H, Liu W, Wu J, Jiang G (2013) Efficient algorithm for hardware/software partitioning and scheduling on MPSoC. J Comput 8(1):61–68CrossRef
Metadata
Title
Algorithmic aspects of graph reduction for hardware/software partitioning
Authors
Guiyuan Jiang
Jigang Wu
Siew-Kei Lam
Thambipillai Srikanthan
Jizhou Sun
Publication date
01-06-2015
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 6/2015
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-015-1381-4

Other articles of this Issue 6/2015

The Journal of Supercomputing 6/2015 Go to the issue

Premium Partner