Wireless Multimedia Sensor Networks (WMSNs) are growing rapidly and are used in applications such as environmental and industrial monitoring and disaster management. A WMSN comprises many low-power camera sensor nodes whose fields of view may overlap to improve the accuracy with which the environment is monitored. This overlap produces a large volume of redundant data, which makes the network inefficient in terms of energy and lifetime. A suitable approach is therefore needed for communicating images over a WMSN that maximizes QoS while minimizing energy consumption, which calls for energy-efficient image processing and transmission. Consequently, aggregating multi-view images using cluster-based distributed routing approaches with QoS constraints has received significant research interest. This paper uses a Distributed Two-Layer Cluster framework in which the local cluster heads (LCHs) and master cluster heads of the various clusters transmit aggregated data to the base station on a multi-hop basis. The aggregated data are generated from multi-view images captured by the correlated camera sensor nodes of a local cluster (LC) and a master cluster (MC) using the first-level data aggregation (FLDA) and second-level data aggregation (SLDA) algorithms, respectively. In FLDA, a novel, low-complexity patch matching algorithm based on protein sequence alignment (PSA) is used to reduce the inter-view redundancy of the multi-view images. The first-level aggregated data transmitted by the LC are then further reduced by combining the multi-view images with the Varying Bit Encoding based on Arithmetic Operations algorithm, which achieves a compression ratio of about 2.54. In SLDA, Higher-Order SVD is used to aggregate the FLDA data from the LCHs in order to reduce the MC transmission rate, achieving a compression ratio of about 9.1.
A comparison of network performance with existing cluster-based routing algorithms shows that the proposed system improves on performance metrics such as energy consumption, packet delivery ratio, end-to-end delay, and network lifetime. The results show that PSA-based TLDA performs better, reducing the average energy consumption for processing and transmitting multi-view images to 0.320 J per round. The proposed system thus prolongs the network lifetime, measured in number of rounds, by 35% and 6% compared with DFRP and GRAD-CORR-based TLDA, respectively.
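As an illustration of the second-level aggregation idea, the following is a minimal sketch of truncated Higher-Order SVD applied to a stack of views, written with NumPy. It is not the paper's SLDA implementation. The tensor layout (views × height × width), the chosen ranks, and the helper names `unfold`, `mode_dot`, `hosvd`, `reconstruct`, and `compression_ratio` are all assumptions made for this example, and the ratio it computes counts stored values only, ignoring quantization and entropy coding.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_dot(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (mode-n product)."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def hosvd(tensor, ranks):
    """Truncated HOSVD: one factor matrix per mode plus a small core tensor."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Left singular vectors of the mode-n unfolding span that mode's subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        # Project each mode onto its truncated basis.
        core = mode_dot(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Approximate the original tensor from the core and factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_dot(out, u, mode)
    return out

def compression_ratio(tensor, core, factors):
    """Ratio of original value count to stored value count (assumed metric)."""
    stored = core.size + sum(u.size for u in factors)
    return tensor.size / stored

# Hypothetical input: 4 correlated views of 8x8 pixels (random data as a stand-in).
rng = np.random.default_rng(0)
views = rng.random((4, 8, 8))
core, factors = hosvd(views, ranks=(2, 4, 4))
approx = reconstruct(core, factors)
ratio = compression_ratio(views, core, factors)
```

In this sketch only `core` and `factors` would need to be transmitted, and the receiver would call `reconstruct` to approximate the views. Smaller ranks raise the ratio at the cost of reconstruction error, so the achievable ratio depends on how strongly the views are correlated.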