Clean Technologies and Environmental Policy

Open Access 21.03.2024 | Original Paper

Detailed analysis of Türkiye's agricultural biomass-based energy potential with machine learning algorithms based on environmental and climatic conditions

Authors: I. Pence, K. Kumas, M. Siseci Cesmeli, A. Akyüz

Published in: Clean Technologies and Environmental Policy

Abstract

In this study, the biomass and energy potential of each province of Türkiye was calculated for the years 2010–2021, using data from 15 field crops and 16 horticultural crops. The total theoretical energy potential obtained from field and garden products, respectively, was calculated as 222,620 Terajoule (TJ) and 61,737 TJ for 2010 and 308,888 TJ and 77,002 TJ for 2021. The agricultural biomass potential for 2021 was estimated with machine learning algorithms from environmental and climate data covering 2010–2020, an approach that has not been studied in the literature. The agricultural biomass potential of Türkiye was modeled using Random Forest, K-Nearest Neighbors (KNN), Gradient Boosting, and eXtreme Gradient Boosting Regressor (XGBR). The models were evaluated in a tenfold cross-validation analysis and in a prediction for 2021 using only climatic and agricultural area data. In addition, feature selection was applied to reduce the number of features and increase the success rate. When the results were generalized, the Random Forest algorithm achieved an R² value of 0.9328 using all features in the tenfold cross-validation analysis and an R² value of 0.9434 using four features in the prediction for 2021. Considering only the 2021 forecast, the KNN algorithm reached the highest result, with an R² value of 0.9560 using only four features. In addition, the Wilcoxon rank-sum test at p = 0.05 shows no significant difference between the predictions and the actual values.

Abbreviations

A	Availability
AEP	Available energy potential
CP	Annual yield
EU	European Union
GHG	Global greenhouse gas
GIS	Geographic information system
KNN	K-nearest neighbors
LHV	Lower heating value
M	Moisture content
MAE	Mean absolute error
R²	Coefficient of determination
RMSE	Root-mean-square error
RPR	Product residue ratio
SGD	Stochastic gradient descent
TBP	Theoretical biomass potential
TEP	Theoretical energy potential
TJ	Terajoule
TUIK	Turkish Statistical Institute
XGBR	Extreme gradient boosting regressor

Introduction

Energy is needed at every stage of modern life. Energy consumption is an essential parameter in determining a country's development level. Energy consumption in developed countries is higher than in developing countries (Khanlari et al. 2020; Can 2022; Ozturk et al. 2017). With the development of technology, the energy demand of developing countries is increasing rapidly to meet the needs. Energy use is both a cause and a consequence of economic growth and development. Energy is essential for most economic activities. Fossil fuels are the primary source of energy for worldwide energy needs (Sayin et al. 2005; Jayarathna et al. 2020). These energy sources' production, transportation, and use cause high emissions, damage to the atmosphere, and climate change now and in the future (Pence et al. 2023; Kaygusuz 2010). Climate change has become an essential global problem related to energy, the economy, the environment, and technology. Significant steps must be taken to reduce global greenhouse gas (GHG) emissions (Barbera et al. 2019; D'Adamo et al. 2019; Zheng et al. 2019). For this reason, it is necessary to determine greenhouse gas emission management in countries subject to international commitments, agreements, and national policies. In this context, it is aimed to increase the share of renewable energy resources with some agreements and protocols to prevent emissions worldwide (Zheng et al. 2019; Chang and Hu 2019). In the 2015 Paris Agreement, which Türkiye also signed, it was decided to reduce CO₂ emissions by 45% compared to 2010 and zero them by 2050.

The European Union (EU) aims to increase the share of renewable energy to at least 32% by 2030, reduce greenhouse gas emissions by 55%, and increase energy efficiency by at least 32.5% (Guler et al. 2022; Filipović et al. 2022). Depending on this decision, reducing emissions caused by energy consumption, which is one of the essential parameters in emissions, is crucial. The economy of Türkiye and industry organization in this direction will strengthen the country's adaptation to the European Green Deal (Tanasa et al. 2020; Cekinir et al. 2022). Increasing renewable energy sources in all sectors is vital to reducing greenhouse gas emissions, especially in industry. Increasing the use of energy resources such as hydroelectric, wind, solar, and biomass will be beneficial in preventing emissions by taking steps toward energy independence and security (Tumen Ozdil and Caliskan 2022).

The largest share of electricity in Türkiye is generated by thermal power plants that consume natural gas, oil, and imported coal. Although the country has considerable lignite and coal potential, its oil and natural gas reserves are almost negligible compared to world reserves (Erat et al. 2021; Telli et al. 2021). Electricity production increased by 45.21% between 2010 and 2020 and by 17.20% between 2015 and 2020, reaching 306,703 GWh in 2020. From 2012 to 2020, the use of renewable energy in electricity generation increased by an annual average of approximately 3.7%, with 7.8% in hydropower, 20.50% in wind, 36% in geothermal, and 31% in biomass (Pence et al. 2023; Ocak and Acar 2021). According to Türkiye's renewable energy targets, renewable energy sources were expected to meet 20% of total energy consumption and 30% of electricity production by 2023 (Pence et al. 2023; Rincon et al. 2019).

Türkiye's electricity demand is expected to reach 424 TWh in 2023, and it is predicted that the demand will be approximately five times higher in 2030 compared to 2000 (Melikoglu and Menekse 2020; Yurtkuran 2021). Türkiye has great potential in renewable energy. Türkiye's geographical location has several advantages for the widespread use of most renewable energy sources. Due to these advantages, it is a suitable country for energy production from renewable sources such as wind, solar, hydroelectricity, and biomass. Hydroelectric power plants come to the fore in Türkiye among these energy sources. As of the middle of 2022, there are 750 hydroelectric power plants in Türkiye (Şenol et al. 2030; Bakay and Ağbulut 2021). Türkiye's total installed power as of the end of 2023 is 106,071 MW. About 55.3% of the power plants in operation are power plants that produce electricity from renewable sources. Türkiye's total electricity installed power consists of thermal power plants (44.7%), hydroelectric power plants (29.8%), wind (11%), solar energy (10.6%), biomass (2.3%), and geothermal (1.6%) (TETC 2024).

Biomass is one of the important renewable energy sources with many uses worldwide. Municipal waste, vegetable oil waste, agricultural waste, forest product waste, industrial waste sludge, and sewage sludge are sources of biomass (Guler et al. 2022). This energy source has many advantages over other renewable energy sources. Some of these advantages are that it is easily obtained from organic materials, easy to obtain energy, and has many uses, such as home and industrial sectors that concern a large part of society (Ozturk et al. 2017; Channi et al. 2022; Jayarathna et al. 2022). Biomass can be burned directly or converted into solid, gaseous, and liquid fuels with the help of conversion technologies such as fermentation to produce alcohol, anaerobic digestion to produce biogas, and gasification to produce natural gas substitutes (Samadi et al. 2020). From an economic and technical point of view, it is one of the most sustainable energy sources for the country. This resource is available in a stored form and can increase employment opportunities in rural areas. It can help to reduce the trade deficit by reducing the dependence of developing countries like Türkiye, which imports a large part of its energy needs, on energy imports (Knápek et al. 2020). Due to these advantages, biomass stands out compared to other renewable energy sources, and it is accepted that biomass is more efficient in terms of economic and technical feasibility (Cekinir et al. 2022; Asghar et al. 2022). Biomass energy can be converted into thermal and electrical energy. For these reasons, energy policies and plans that ensure food safety worldwide should be supported in using biomass for energy purposes locally, nationally, and globally (Toklu 2017; Singh 2016).

The use of bioenergy in Türkiye was first discussed at the First Agricultural Congress in 1931, where it was proposed that the fuels needed for agricultural machinery be produced from domestic resources instead of imports. It was emphasized that fuels obtained from local resources benefit the national economy, and the importance of biofuel production was discussed in various dimensions. The first official document regarding biofuels in Türkiye was signed in 1934. Biogas production, one of the bioenergy sources, was initiated by the Soil and Water Research Institute in the 1950s. In the 1960s, pilot facilities were established within the State Production Farms, and eight biogas facilities were established by the Eskişehir Soil Water Research Institute affiliated with the Ministry of Agriculture. However, this work ended due to a lack of technical personnel and inadequate training of farmers. With the oil crisis in the early 1980s, studies on establishing biogas units increased. After 1980, studies on biogas production in Türkiye gained momentum, and the work that started with a 35 m³ facility in Muş Province expanded to approximately 1000 facilities with state support. Studies in this field continued at an increasing pace after 2000. At the end of 2023, there were a total of 212 bioenergy facilities in Türkiye, 199 of which produce electrical energy from bioenergy, eight of which produce biodiesel, and five of which produce bioethanol (RTMENR 2024; Kumaş et al. 2019; Hatunoğlu 2010).

Biomass residues are directly linked to crop yield at all stages of agricultural production. Because residues constitute a certain percentage of the crop, higher production results in more residues. During crop and fruit production, many residues are obtained from field and horticultural crops. Biomass residues are the straws, stems, leaves, branches, etc., that remain after the main crop is harvested, as well as the remnants left by cutting and pruning (Tumen Ozdil and Caliskan 2022; Güney and Kantar 2020; Avcıoğlu et al. 2019). Biomass energy potential can be calculated based on these parameters. The potential also depends on crop yield, the agronomic development of the crop, and environmental factors such as climatic conditions and soil structure. For this reason, even if the same amount of agricultural product is obtained, different amounts of residues usable as biomass are obtained in different countries (Zafar et al. 2021; Balsalobre-Lorente et al. 2019; Aydin 2019). Türkiye shows different characteristics depending on its geographical location and landforms and therefore has a wide variety of agricultural products (Senocak and Guner 2022). Indicators such as climatic data, harvest time, and geographical conditions affect the amount of agricultural products. In recent years, agricultural areas have also been affected by sudden meteorological events associated with climate change (Avcıoğlu et al. 2019; Zheng and Qiu 2020). In the agricultural sector, the climatic characteristics of a region are the primary consideration when making construction, operation, and production plans.

Climate and meteorological data are the absolute guiding factors in selecting the plant variety to be grown, tillage, planting, pruning, hoeing, irrigation, fertilization, spraying, harvesting, and micro-climatological environment planning. Climatic factors (temperature, relative humidity, dew, fog, precipitation, cloudiness, light, wind, snow, and frost) affect agricultural activities and cause severe problems if various applications are not made according to climatic factors (RTMAF 2022).

In recent years, studies have been carried out in Türkiye on using biomass resources such as hazelnut shells, agricultural waste, wheat straw, tea waste, and olive peel for energy purposes. It is imperative to focus on efficient production to meet the increasing energy demand and to use biomass energy to meet both traditional and modern fuel requirements (Yurtkuran 2021; Balat 2005). The potential of biomass resources has been studied both worldwide and in Türkiye. Studies that calculated agricultural biomass residues and potentials, and the results they reported, are summarized below.

Singh (2016) developed a method for estimating India's biomass power potential from agricultural waste. It has been determined that 650 Mt of agricultural biomass is obtained annually in the country, and 1/3 of it can be used. It is stated that the equivalent of this biomass has an energy potential of 3.72 EJ, which is approximately equal to 23–35 GW of electrical power (Singh 2016).

Ozturk et al. (2017) compared the renewable energy and biomass potential of Türkiye and Malaysia under future energy scenarios. By evaluating the biomass resources of the two countries, they assessed the possible contribution of biomass to each economy (Ozturk et al. 2017). Toklu (2017) examined Türkiye's biomass potential for different sources in 2010. The study stated that the total biomass energy potential of Türkiye is approximately 33 Mtoe, while the usable biomass potential is approximately 17 Mtoe (Toklu 2017). Bilandzija et al. (2018) examined the biomass and energy potential of Croatia's agricultural waste. Biomass potentials of 3050.3 t, 1441.8 t, and 733.68 t were determined for different scenarios, corresponding to energy potentials of 51.14 PJ, 24.06 PJ, and 12.18 PJ (Bilandzija et al. 2018). Ma et al. (2018) used artificial neural networks to model and predict the production and consumption of biomass and hydroelectric energy resources in the USA. The study used data from 2009 to 2016 with LSTM and RNN estimation algorithms (Ma et al. 2018). Avcıoğlu et al. (2019) investigated the biomass and energy potential of agricultural wastes in Türkiye. The total amount of biomass obtainable from field and garden plant wastes was calculated as 9432 kt and 15,652 kt, respectively. The theoretical energy potential based on the amount of biomass was estimated as 908,119 Terajoule (TJ) from field crops and 90,354 TJ from horticultural crops (Avcıoğlu et al. 2019). Moustakas et al. (2020) investigated the biomass potential of the Thessaly Region, Greece, and reported a biomass potential from agricultural waste of approximately 707,164 tons/year. They stated that a maximum of 619 GWh of electricity and 895 GWh of heat can be obtained per year from this potential (Moustakas et al. 2020). Knápek et al. (2020) examined the biomass potential of the Czech Republic for different scenarios. The GIS (geographic information system) method was used to determine the potential, and it was stated that the biomass potential would increase by 35% in energy production with the planning of the arable land. Considering the year 2040, it was stated that the biomass potential could increase by 42 PJ in total (Knápek et al. 2020). Samadi et al. (2020) estimated Iran's energy production from agricultural waste by gasification technology. It is stated that the total energy obtained from agricultural residues by gasification is 341.29 TJ, and the amount of electricity and heat is 66,075 and 399,112 TJ, respectively (Samadi et al. 2020). Tumen Ozdil and Caliskan (2022) determined the biomass potential of agricultural wastes in Türkiye between 2008 and 2018 and the electrical energy that could be produced from it. The total theoretical average biomass potential obtained from its plants was calculated as 522,875 kt and 51,359 kt, respectively. The potential of electricity produced from agricultural field crops is 994 × 10⁹ kWh (Tumen Ozdil and Caliskan 2022). Senocak and Guner (2022) used artificial intelligence to estimate the amount of animal and agricultural waste expected in the coming years and its energy potential for the Acıpayam district of Denizli, Türkiye. In addition, spatial analyses and different scenario evaluations were carried out using GIS (Senocak and Guner 2022).

Due to climate and land conditions, Türkiye has an essential geographical position regarding energy crop cultivation. Although it has a high potential in terms of biomass resources potential and the energy that can be obtained from these resources, the desired levels have not been reached in terms of installed power in the country (Pence et al. 2023; Şenol et al. 2030; Avcıoğlu et al. 2019). Determining the amount and distribution of biomass resources as accurately as possible is crucial in making strategic decisions such as energy management policies (Tumen Ozdil and Caliskan 2022; Toklu 2017; Avcıoğlu et al. 2019). According to the literature studies, generally known basic statistical approaches were used to estimate the renewable energy potential in Türkiye. Adopting a variety of scenario approaches that also take into account uncertain factors can lead to better results. In addition, reliable, analytical, and flexible estimation methods are needed to determine the energy potential of biomass. By adopting such a systematic, integrated approach, energy management decisions can be made more effectively. Unlike the literature, this study proposes a decision support method that enables estimating the energy potential of the resources produced from biomass.

A machine learning algorithm is a subset of artificial intelligence that learns from data to improve performance on a given task without being explicitly programmed. A machine can learn from experience and improve its performance over time by recognizing patterns and making decisions based on them. Machine learning can be classified into supervised, unsupervised, semi-supervised, and reinforcement learning. In supervised learning for regression, a model is trained to predict a continuous numerical output based on a set of input features. In supervised learning, the model is provided with training examples that include both the input features and the corresponding target values, and the goal is to build a mapping function that can accurately predict the target value based on new, unseen input values (Goodfellow et al. 2016).

Image and speech recognition, natural language processing, recommendation systems, and predictive modeling are just a few of the many uses for machine learning algorithms. They are well suited to tasks that are too difficult or time-consuming for humans to complete manually. Machine learning can be used to develop models that accurately predict the energy equivalent of the biomass potential, which is otherwise calculated theoretically from the residue ratio, production amount, average moisture, lower heating value, and percentage availability of each field and orchard crop. These models can be trained on large datasets of climate data, agricultural land, and the corresponding biomass potential, allowing them to learn the relationships between them. Once trained, the models can predict biomass potential in new areas or at other times, saving the time and resources needed for direct measurements or calculations.

Türkiye is geographically located between the European and Asian continents. Due to its location, the country has climate transitions, different land structures, and diverse agricultural products. No study has been found in the literature that includes long-term and up-to-date data for countries with such characteristics. When the literature is examined, it is noted that there are few studies on agricultural biomass. Generally, only theoretical biomass calculations have been made in the literature, and other factors affecting it have not been examined. This study tried to close this gap in the literature by incorporating climate data, the effects of which have not been examined before, into machine learning models for agricultural biomass potential estimation.

Regression models are used to predict the value of a dependent variable. It is essential to use these algorithms to predict the value of a dependent variable and explain the relationship between variables. It was assumed that the amount of agricultural biomass potential depends on independent variables such as climate data, and modeling with regression algorithms was preferred for this purpose. Among the regression models, Random Forest, K-Nearest Neighbors (KNN), and Gradient Boosting algorithms, known as state-of-the-art in the literature, are widely used. The eXtreme Gradient Boosting Regressor (XGBR), which has become popular with its successful results in recent years, also stands out.
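To make the idea of regression on tabular features concrete, the following is a minimal, self-contained sketch of a K-nearest-neighbors regressor evaluated with tenfold cross-validation and R². It is an illustration only, not the implementation used in the study. The synthetic data, the helper names (`r2_score`, `knn_predict`, `kfold_r2`), the value of `k`, and the choice to pool out-of-fold predictions before computing R² are all assumptions made for this example.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_x, train_y, query, k=3):
    """Predict the mean target of the k training points closest to query."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def kfold_r2(xs, ys, n_splits=10, k=3, seed=0):
    """Tenfold cross-validation; R2 is computed on pooled out-of-fold predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_splits] for i in range(n_splits)]
    y_true, y_pred = [], []
    for fold in folds:
        held_out = set(fold)
        tr_x = [xs[i] for i in idx if i not in held_out]
        tr_y = [ys[i] for i in idx if i not in held_out]
        for i in fold:
            y_true.append(ys[i])
            y_pred.append(knn_predict(tr_x, tr_y, xs[i], k))
    return r2_score(y_true, y_pred)


# Synthetic stand-in data (hypothetical): two features already scaled to
# [0, 1], e.g. agricultural area and temperature, and a noisy linear target.
rng = random.Random(42)
xs = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(200)]
ys = [3.0 * area + 2.0 * temp + rng.gauss(0, 0.05) for area, temp in xs]

score = kfold_r2(xs, ys)
print(f"pooled tenfold R2 = {score:.3f}")
```

Because KNN relies on distances, features on very different scales would normally be standardized first; the synthetic features here are already on a common scale.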

In this study, the agricultural biomass potential of each province in Türkiye for the years 2010–2021 was modeled using various climate data and agricultural land. Random Forest, KNN, Gradient Boosting, and XGBR, popular machine learning algorithms in the literature, were used for modeling.

The novelty and contribution are: (1) Agricultural biomass potential has been modeled with high success by machine learning methods using climate data; (2) with the feature selection, the prediction success was increased by using only the agricultural area, temperature, biomass type, and humidity; and (3) considering many years in terms of agricultural biomass, a model has been established for Türkiye.

This study consists of four parts. The first part examined literature studies in the world and Türkiye. In the second part, the mathematical calculation of the theoretical biomass potential and the methods to be used in estimation have been explained in detail. The findings obtained from the mathematical method and estimation were compared in the third part. Finally, in the fourth part, the results and recommendations were given.

Materials and methods

Türkiye lies between 36° and 42° north latitude and 26° and 45° east longitude, at an important point that connects the continents of Europe and Asia like a bridge. The surface area of Türkiye is 814,578 km², and its population reached 84.5 million in 2022 (Pence et al. 2023). Mediterranean, Black Sea, and continental climate types are seen in Türkiye. The Mediterranean climate, with hot, dry summers and mild, rainy winters, is seen on the Aegean and Mediterranean coasts. The Black Sea climate is rainy in all seasons, and the continental climate is observed in the inner parts of the country. Wheat, barley, and corn are the most grown products in Türkiye. In addition, products with high economic returns, such as cotton, flax, sesame, and poppy, have been grown for a long time. Soybeans are grown in the Mediterranean region, and a wide variety of fruits are grown in many parts of Türkiye. Therefore, various agricultural residues are a source of biomass energy (Can 2022; Tumen Ozdil and Caliskan 2022; Şenol et al. 2030). In the second half of 2022, production of cereals and other crop products was expected to increase by approximately 14% compared to 2021, and production of fruits, beverages, and spice plants by 3%. Production amounts in 2022 were approximately 70 million tons of cereals and other crop products, 32 million tons of vegetables, and 26 million tons of fruits, beverages, and spice plants. Cereal production volumes increased by 21% in 2022 compared to 2021. Compared with the previous year, wheat production increased by 12% to 19.8 million tons, and corn production increased by 23% to 8.3 million tons. In addition, barley production increased by 48% to 8.5 million tons, rye production increased by 37% to 273 thousand tons, and oat production increased by 32% to 365 thousand tons.
The production of fruits, beverages, and spice plants increased by 3.8% in 2022 compared to the previous year and became approximately 25.8 million tons. Compared to 2021, fruit products increased by 5% in apples, 13% in grapes, 12% in peaches and nectarines, 8% in plums, 8% in strawberries, and 71% in olives. However, there was a decrease of 17% in tangerines, 31% in oranges, and 32% in lemons. There was an increase of 11.8% in hazelnuts, 99% in pistachios, 9% in figs, and 13% in bananas. Bean production decreased by 11.5% to 270 thousand tons, soybean production decreased by 15% to 155 thousand tons, and sunflower production increased by 5% to approximately 2.6 million tons. Tobacco production increased by 15% to 82 thousand tons, and sugar beet production increased by 7% to 19 million tons (TUIK 2022).

Data collection and theoretical calculation of biomass potential energy

In this study, the production amount of different field and garden products used for biomass production between 2010 and 2021 was taken from the Turkish Statistical Institute (TUIK). It was examined in two categories: 15 different plants from field products and 16 different plants from garden products. The total production amounts of the field and garden products used in the study, selected as an example, in Türkiye for the past 5 years are given in Table 1 (TUIK 2022).

Table 1

Production amount of field and garden products in 2017–2021 (tons/year)

Agricultural crops	2017	2018	2019	2020	2021
Wheat	21,500,000	20,000,000	19,000,000	20,500,000	17,650,000
Maize	5,900,000	5,700,000	6,000,000	6,500,000	6,750,000
Rye	320,000	320,000	310,000	295,681	200,000
Beans	239,000	220,000	225,000	279,518	305,000
Soybean	140,000	140,000	150,000	155,225	182,000
Groundnut	165,330	173,835	169,328	215,927	234,167
Canola or rapeseed seed	60,000	125,000	180,000	121,542	140,000
Sunflower	1,964,385	1,949,229	2,100,000	2,067,004	2,415,000
Rice	900,000	940,000	1,000,000	980,000	1,000,000
Sugar beet	21,150,900	17,439,087	18,056,661	23,028,285	17,768,837
Tobacco	93,666	75,275	68,223	79,081	71,497
Cotton	2,450,000	2,570,000	2,200,000	1,773,646	2,250,000
Barley	7,100,000	7,000,000	7,600,000	8,300,000	5,750,000
Grapes	4,200,000	3,933,000	4,100,000	4,208,908	3,670,000
Banana	369,009	498,888	548,323	728,133	883,455
Fig	305,689	306,499	310,000	320,000	320,000
Grapefruit	260,000	250,000	249,185	238,012	249,000
Lemon	1,007,133	1,100,000	950,000	1,188,517	1,550,000
Oranges	1,950,000	1,900,000	1,700,000	1,333,975	1,742,000
Mandarin	1,550,469	1,650,000	1,400,000	1,585,629	1,819,000
Apples	3,032,164	3,625,960	3,618,752	4,300,486	4,493,264
Pear	503,004	519,451	530,723	545,569	530,349
Apricot	985,000	750,000	846,606	833,398	800,000
Cherry	627,132	639,564	664,224	724,944	689,834
Peach	771,459	789,457	830,577	892,048	891,857
Almond	90,000	100,000	150,000	159,187	178,000
Hazelnut	675,000	515,000	776,046	665,000	684,000
Pistachio	78,000	240,000	85,000	296,376	119,355
Olive	2,100,000	1,500,467	1,525,000	1,316,626	1,738,680

The structural and physical properties of the field and garden crops that are needed to obtain energy from biomass residues were determined. These properties are the residue-product ratio, residual moisture, and energy value. The residual moisture content, product residue ratio, lower heating value, and availability rates of field and horticultural crops used in the mathematical calculation were taken from the literature and are given in Table 2 (Tumen Ozdil and Caliskan 2022; Avcıoğlu et al. 2019).

Table 2

Parameters used for field crops and horticultural crops calculations

Agricultural crops	Residue types	M min (%)	M max (%)	M avg (%)	RPR min	RPR max	RPR avg	LHV min (MJ/kg)	LHV max (MJ/kg)	LHV avg (MJ/kg)	Availability (A) (%)
Wheat	Straws	10	15	13	0.5	1.75	1.13	13.9	19.5	16.7	15
Maize	Stalks	15	17	16	1.5	2.25	1.88	15.5	18.5	17	60
	Cobs	7	9	8	0.27	0.86	0.57	12.6	18.4	15.5	60
Rye	Straws	15	15	15	0.99	0.99	0.99	17.4	17.4	17.4	15
Beans	Stems-leaves	5	5	5	1.4	1.5	1.45	14.7	14.7	14.7	15
Soybean	Straws	15	15	15	0.76	3.5	2.13	14.9	19.4	17.2	60
Groundnut	Shells	8	8	8	0.2	0.52	0.36	11.2	16.9	14.1	80
	Straws-haulms	15	15	15	2.1	2.3	2.2	14.4	15.2	14.8	80
Canola or Rapeseed seed	Stalks	25	25	25	1.6	1.8	1.7	17.1	17.1	17.1	15
Sunflower	Stems-leaves	14	40	27	0.7	3.5	2.1	13.2	17.9	15.6	60
Rice	Straws	10	25	18	0.45	1.75	1.1	8.8	16	12.4	60
	Husks	10	13	12	0.2	0.27	0.24	12.9	19.9	16.4	80
Sugar beet	Leaves	75	75	75	0.12	0.14	0.13	15.5	17.7	16.6	15
Tobacco	Stems	85	85	85	2.27	2.27	2.27	16.1	16.1	16.1	60
Cotton	Stalks	6	12	9	1.1	3.5	2.3	14.6	18.2	16.4	60
Barley	Straws	11	15	13	1.08	1.36	1.22	17.5	19.5	18.5	15
Grapes	Pruning	40	50	45	0.39	0.45	0.42	16.8	19.2	18	80
Banana	Stalk-peels	85	85	85	2	2	2	13.1	13.1	13.1	80
Fig	Pruning	55	55	55	0.21	0.21	0.21	18	18.4	18.2	80
Grapefruit	Pruning	40	40	40	0.11	0.11	0.11	17.6	17.6	17.6	80
Lemon	Pruning	35	45	40	0.19	0.4	0.3	17.6	17.6	17.6	80
Oranges	Pruning	35	45	40	0.2	0.5	0.35	17.6	18.5	18.1	80
Mandarin	Pruning	35	45	40	0.17	0.4	0.29	17.6	17.6	17.6	80
Apples	Pruning	40	40	40	0.19	0.19	0.19	17.8	17.8	17.8	80
Pear	Pruning	35	40	38	0.14	0.3	0.22	18	18.4	18.2	80
Apricot	Pruning	40	40	40	0.19	0.19	0.19	19.3	20.8	20	80
Cherry	Pruning	40	40	40	0.19	0.19	0.19	21.7	21.7	21.7	80
Peach	Pruning	35	45	40	0.3	0.5	0.4	18	18.4	18.2	80
Almond	Pruning	35	40	38	0.6	0.61	0.6	18	18.4	18.2	80
Hazelnut	Pruning	40	40	40	3.34	3.34	3.34	19	19	19	80
Pistachio	Pruning	35	35	35	0.4	0.48	0.44	18	19	18.5	80
Olive	Pruning	35	45	40	1.14	1.25	1.2	18.1	18.8	18.5	50

Considering the residue types (stalk, stem-leaf, straw, bark, pruning, etc.) of field and garden products and their relative moisture content, residue production rates, and lower heating values, separate usable energy potential values were calculated for each of the 81 provinces in Türkiye. Knowing the areas where the crops are grown, the type and characteristics of the residues, and their energy capacity is essential for choosing the location of possible biomass power plants and for the sustainability of energy supply. Unlike previous studies conducted in Türkiye and the world, this study models agricultural biomass potential with machine learning algorithms based on climate data, an approach not previously reported in the literature.

Theoretical biomass energy potential has been analyzed with a similar formulation (Eqs. 1–3) in the literature (Guler et al. 2022; Tumen Ozdil and Caliskan 2022; Avcıoğlu et al. 2019; Singh 2015). This calculation method in the literature was used in this study.

Theoretical biomass potential (TBP) was calculated from Eq. (1), theoretical energy potential (TEP) from Eq. (2), and available energy potential (AEP) from Eq. (3). In the equations, CP(i) is the annual yield of crop i (tons); RPR(i) is the product residue ratio (a dimensionless ratio, as listed in Table 2); M(i) is the moisture content (%); LHV(i) is the lower heating value (MJ/kg); and A(i) is the availability (%). Availability accounts for the part of the residue that is actually converted into energy; it refers to the percentage of the theoretical energy that can be produced from the waste. The product residue ratio, moisture content, heating value, and availability percentage for each field and garden product included in the equations are taken from Table 2 (Tumen Ozdil and Caliskan 2022; Singh 2015; Riva et al. 2014; Hiloidhari and Baruah 2011). Theoretical energy potential mapping of field and horticultural crops is shown in Fig. 1.

$${\text{TBP}} = \mathop \sum \limits_{i = 1}^{n} CP_{\left( i \right)} \times {\text{RPR}}_{\left( i \right)} \times \left[ {\frac{{100 - M_{\left( i \right)} }}{100}} \right]$$

(1)

$${\text{TEP}} = \mathop \sum \limits_{i = 1}^{n} {\text{TBP}}_{\left( i \right)} \times {\text{LHV}}_{\left( i \right)}$$

(2)

$${\text{AEP}} = \mathop \sum \limits_{i = 1}^{n} {\text{TEP}}_{\left( i \right)} \times A_{\left( i \right)}$$

(3)
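
As a worked illustration of Eqs. (1)–(3), the following Python sketch evaluates the three quantities for a small set of residues. The residue values are hypothetical placeholders rather than entries from Table 2, and the class and function names are illustrative. The sketch assumes that RPR is given as a plain ratio, moisture and availability as percentages, and yield in tons, so that tons × MJ/kg gives GJ.

```python
from dataclasses import dataclass


@dataclass
class Residue:
    """One crop residue; field names are illustrative, not from the paper."""
    crop_yield_t: float      # CP: annual yield (tons)
    rpr: float               # RPR: residue-to-product ratio (assumed a plain ratio)
    moisture_pct: float      # M: moisture content (%)
    lhv_mj_per_kg: float     # LHV: lower heating value (MJ/kg)
    availability_pct: float  # A: availability (%)


def tbp(r: Residue) -> float:
    """Dry residue mass in tons: the summand of Eq. (1)."""
    return r.crop_yield_t * r.rpr * (100.0 - r.moisture_pct) / 100.0


def tep(r: Residue) -> float:
    """Theoretical energy in GJ: the summand of Eq. (2), tons x MJ/kg = GJ."""
    return tbp(r) * r.lhv_mj_per_kg


def aep(r: Residue) -> float:
    """Available energy in GJ: the summand of Eq. (3)."""
    return tep(r) * r.availability_pct / 100.0


# Hypothetical residues for one province (placeholder numbers).
residues = [
    Residue(1000.0, 1.2, 40.0, 18.5, 50.0),
    Residue(500.0, 0.5, 20.0, 18.0, 80.0),
]

total_tbp = sum(tbp(r) for r in residues)  # tons of dry residue
total_tep = sum(tep(r) for r in residues)  # GJ
total_aep = sum(aep(r) for r in residues)  # GJ
print(total_tbp, total_tep, total_aep, total_aep / 1000.0)  # last value in TJ
```

Summing per-residue terms keeps each equation's summand separate, which makes it easy to check a single crop by hand before aggregating over a province.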

Machine learning processes

The theoretical calculation of the energy equivalent of the biomass potential is a complicated process because it requires specific information about each product in the vegetable and grain categories, and this information is difficult to obtain for different geographies. It is therefore easier to estimate the energy equivalent of the biomass potential from more readily available climate data for a geographic location than from the residue ratio, production amount, average moisture, lower heating value, and availability percentage of each field and orchard crop. Machine learning algorithms were used to establish such a model. Creating the model involves several essential steps, from data preprocessing to the prediction output obtained after training and testing. The flowchart of the processes carried out in this study is shown in Fig. 2.

For 2010–2021, the energy equivalent of the biomass potential of the products in the vegetable and grain categories was calculated theoretically for every province in Türkiye, and a dataset was created. The energy equivalent of the biomass potential is the target value, and the input values are agricultural area, biomass type (vegetable or grain), year, temperature, maximum temperature, minimum temperature, humidity, wind speed, precipitation, soil temperature at 5 cm, and sunshine duration. Climatic data were obtained from the Turkish State Meteorological Service (TSMS 2022). The dataset contains a total of 1944 samples (972 vegetable and 972 grain) from 81 provinces over 12 years; 1782 samples are from 2010 to 2020, and 162 are from 2021. Table 3 provides the data specification of the dataset used in this study, including its source, format, variables, quality, and usage.

Table 3

The data specification of the dataset used in this study

Parameter	Description
Data source	Agricultural data from TUIK
Data source	Meteorological data from the Turkish State Meteorological Service
Data format	Excel file
Temporal resolution	Yearly averages (data have been aggregated into yearly averages)
Date range	From 2010 to 2021
Data size	[1944 × 12]
Target variable	Agricultural biomass potential (Terajoule)
Input variables
Agricultural area	Area of land dedicated to agriculture (hectares)
Biomass type	Type of biomass (vegetable or grain)
Year	Year of the observation
Temperature	Temperature in degrees Celsius
Maximum temperature	Maximum temperature in degrees Celsius
Minimum temperature	Minimum temperature in degrees Celsius
Humidity	Humidity in percentage
Wind speed	Wind speed in meters per second
Precipitation	Precipitation in millimeters
Soil temperature at 5 cm	Soil temperature at 5-cm depth in degrees Celsius
Sunshine duration	Duration of sunshine in hours
Data quality	The dataset has been processed to handle missing values and variables on different scales: null values were replaced with 0, and the data were then normalized
Usage	The dataset can be used to predict the agricultural biomass potential based on the input variables such as agricultural area, biomass type, meteorological factors, and year information

Preprocessing the data

Cleaning, transforming, and normalizing the data may be necessary to bring it into a suitable format before training a machine learning model. This can involve tasks such as scaling the data to a standard range or filling in missing values. Due to geographical conditions and other reasons, some products in the vegetable or grain categories are not grown in some provinces, so the generated dataset contains some null values. In the preprocessing stage, the null values in the dataset were replaced with 0. Normalization was then performed because each input variable has a different range. In z-score normalization, the mean of a variable is subtracted from each of its values, and the result is divided by the variable's standard deviation; the resulting values are called z-scores or standard scores. Because z-score normalization converts the input variables to the same scale, it allows different variables to be compared during feature selection.
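
The two preprocessing steps described above can be sketched as follows. This is a minimal illustration with NumPy, not the authors' code: the sample rows are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which variant was used.

```python
import numpy as np


def fill_nulls_and_zscore(X):
    """Replace NaN with 0, then standardize each column to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    X = np.where(np.isnan(X), 0.0, X)      # null values -> 0
    mean = X.mean(axis=0)
    std = X.std(axis=0)                    # population standard deviation (assumption)
    std = np.where(std == 0.0, 1.0, std)   # guard against constant columns
    return (X - mean) / std


# Hypothetical rows: [agricultural area (ha), temperature (deg C), humidity (%)]
raw = np.array([
    [120000.0, 11.5, 60.0],
    [np.nan,   18.2, 55.0],
    [450000.0, 14.0, np.nan],
    [80000.0,  9.8,  70.0],
])
Z = fill_nulls_and_zscore(raw)
print(Z.round(3))
```

After this transformation every column has mean 0 and standard deviation 1, so a feature measured in hectares and one measured in degrees Celsius contribute on the same scale.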

Validation

After preprocessing, the data are divided into a training set and a test set. The training set is used to train the machine learning model, while the test set is used to evaluate its performance. It is important to evaluate the model on data not used for training in order to assess its generalizability and detect overfitting. This can be done with techniques such as k-fold cross-validation or a time-ordered hold-out split. In this study, tenfold cross-validation was used for 2010–2020, while the 2021 data were held out as a time-series test set.

Tenfold cross-validation is a technique for assessing the effectiveness of a machine learning model. The dataset is divided into ten folds; the model is trained on nine of them and tested on the remaining one. This process is repeated ten times, with a different fold serving as the test set each time. The model's performance is then estimated as the average performance across all ten iterations. Since each observation is used for testing exactly once, tenfold cross-validation allows the model to be assessed on the entire dataset, which gives a more reliable estimate of its performance than a single split. In this study, setting the random_state = 42 parameter in the Scikit-learn library ensured that the same test folds were used for all machine learning models.
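
A minimal sketch of such a split with Scikit-learn's `KFold` is given below. The feature matrix is a placeholder, and `shuffle=True` is an assumption: the text only mentions random_state = 42, which in `KFold` has an effect only when shuffling is enabled.

```python
import numpy as np
from sklearn.model_selection import KFold

n_samples = 1782                      # 2010-2020 samples reported in the text
X = np.zeros((n_samples, 11))         # placeholder feature matrix

# shuffle=True is an assumption; random_state only has an effect when shuffling.
kfold = KFold(n_splits=10, shuffle=True, random_state=42)

times_tested = np.zeros(n_samples, dtype=int)
fold_sizes = []
for train_idx, test_idx in kfold.split(X):
    times_tested[test_idx] += 1       # count how often each sample is in a test fold
    fold_sizes.append(len(test_idx))

print(fold_sizes)
print(times_tested.min(), times_tested.max())
```

Reusing the same `KFold` object, or the same seed, for every algorithm means that all models are scored on identical test folds, which makes their scores directly comparable.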

Feature selection

In feature selection, a subset of relevant features is chosen from the full set of dataset features. This can improve the model's performance, interpretability, and capacity for generalization while reducing its complexity. It also lowers the risk of incorporating noisy or irrelevant features, which can degrade performance. By focusing only on the most essential features, the model can make future predictions more easily and generalize better to new data. Feature selection can be carried out by manual selection, wrapper methods, or filter methods. Wrapper methods are a family of supervised feature selection techniques that evaluate various subsets of features and choose the best one according to the score of a particular machine learning algorithm (Géron 2019). This study uses Stochastic Gradient Descent (SGD) as the feature scorer. SGD fits a linear function by minimizing a selected loss function; here, a least-squares fit is used, with mean squared error as the loss function. The algorithm approximates the true gradient by considering one sample at a time and updates the model along the gradient of the loss function. The learning rate of SGD was set to 0.01 and the number of iterations to 100. The features, ordered by SGD score from highest to lowest, are shown in Fig. 3.
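
The scoring idea can be illustrated with Scikit-learn's `SGDRegressor`. The text does not say how the scores were derived from the fitted model, so ranking features by the absolute value of their linear coefficients is an assumption here, as are the synthetic data and the `tol=None` setting; only the learning rate (0.01) and the number of iterations (100) come from the text.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
n = 500
X = rng.standard_normal((n, 3))            # already on a z-score-like scale
# Synthetic target: feature 0 matters most, feature 2 is pure noise.
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.standard_normal(n)

# Squared-error loss is the SGDRegressor default; eta0 and max_iter follow the text.
# tol=None runs all 100 epochs instead of stopping early (an assumption).
sgd = SGDRegressor(eta0=0.01, max_iter=100, tol=None, random_state=42)
sgd.fit(X, y)

scores = np.abs(sgd.coef_)                 # assumed importance score
ranking = np.argsort(scores)[::-1]         # feature indices, highest score first
print(ranking, scores.round(3))
```

Coefficient magnitudes are comparable across features only because the inputs share a common scale, which is why normalization precedes this step.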

As shown in Fig. 3, the highest score belongs to the agricultural area attribute. After a preliminary study, it was decided to select the four attributes with the highest scores, namely, agricultural area, temperature, biomass type, and humidity. The fifth- and sixth-ranked attributes are also temperature-related, and a temperature attribute had already been selected. In addition, it was observed that the regression success decreased when more than four features were selected.

Machine learning methods

Several types of algorithms can be used for supervised regression. Of these, this study used Random Forest, KNN, Gradient Boosting, and XGBR. These methods are widely used in the literature and have proven effective on many problems. XGBR, in particular, has gained widespread popularity in the machine learning community due to its high performance and ease of use. It is often used in machine learning competitions and as a base model for solving real-world problems.

Random forest

Random Forest regression, proposed by Breiman (Breiman 2001), predicts a continuous outcome variable from one or more input features. It is an ensemble method that combines the predictions of multiple individual models to make a final prediction. Each individual model in a Random Forest is a decision tree. Decision trees are tree-like models that base their predictions on a sequence of decisions about the input feature values: each branching point represents a decision, and the prediction is made at the tree's leaf nodes. Random Forest regression is a powerful and widely used method for regression tasks such as predicting prices, temperatures, and quantities.
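
The ensemble idea can be checked directly in Scikit-learn: the forest's prediction is the average of its trees' predictions. The sketch below uses synthetic data and assumes that the `min_samples_leaf` parameter of `RandomForestRegressor` corresponds to the "leaf" hyperparameter used later in this study; the data and seed are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 4))
y = 10.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + 0.1 * rng.standard_normal(300)

# min_samples_leaf=1 lets every tree grow until its leaves hold single samples.
forest = RandomForestRegressor(min_samples_leaf=1, random_state=42)
forest.fit(X, y)

# Predictions of the individual trees for the first five samples.
per_tree = np.array([tree.predict(X[:5]) for tree in forest.estimators_])
ensemble = forest.predict(X[:5])
matches = bool(np.allclose(per_tree.mean(axis=0), ensemble))
print(matches)
```

Larger values of `min_samples_leaf` force each leaf to average over more training samples, which smooths the individual trees at the cost of a looser fit to the training data.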

K-nearest neighbors

The KNN algorithm is a straightforward but effective machine learning method for classification and regression tasks. It rests on the idea that similar instances typically have similar outcomes, so an instance can be predicted from its nearest neighbors in the feature space. The value k is the number of nearest neighbors used to make the prediction. For regression tasks, the prediction can be the average of the target values of the k nearest neighbors. KNN is a non-parametric method and makes no assumptions about the underlying data distribution. Each prediction requires calculating the distances between the query instance and the instances in the training set. Because KNN is sensitive to the scale of the input features, the data may need to be normalized or standardized (Géron 2019).
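
KNN regression is simple enough to write out by hand. The sketch below is a minimal NumPy version with Euclidean distance and an unweighted average, both of which are assumptions for illustration; the tiny dataset is made up.

```python
import numpy as np


def knn_predict(X_train, y_train, x_query, k=2):
    """Predict by averaging the targets of the k nearest training samples."""
    distances = np.linalg.norm(X_train - x_query, axis=1)  # query vs. every training row
    nearest = np.argsort(distances)[:k]                    # indices of the k closest rows
    return float(y_train[nearest].mean())


# Made-up one-feature training set.
X_train = np.array([[0.0], [1.0], [2.0], [10.0]])
y_train = np.array([0.0, 10.0, 20.0, 100.0])

prediction = knn_predict(X_train, y_train, np.array([1.2]), k=2)
print(prediction)
```

For the query 1.2 the two nearest training points are 1.0 and 2.0, so the prediction is the mean of their targets, 15.0.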

Gradient boosting

The Gradient Boosting machine learning technique can be used for regression tasks that predict a continuous outcome variable from one or more input features. It is an ensemble method that combines the predictions of many individual models to produce a final prediction. The individual models in a Gradient Boosting regression model are decision trees, which base their predictions on a sequence of decisions about the input feature values. The main distinction between Gradient Boosting and other ensemble methods, such as Random Forest, is that the individual decision trees are trained sequentially, with each tree attempting to correct the mistakes made by the previous trees. Each new tree is fitted to the negative gradient of the loss function, which for a squared-error loss is the difference between the actual and predicted values of the outcome variable (Friedman 2001).
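
The sequential fitting can be sketched in a few lines for the squared-error case, where the negative gradient is simply the residual. This is a didactic loop built on `DecisionTreeRegressor`, not Scikit-learn's `GradientBoostingRegressor` itself; the data, tree depth, learning rate, and number of stages are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())     # stage 0: constant model
mse_history = [float(np.mean((y - prediction) ** 2))]

for _ in range(50):
    residual = y - prediction              # negative gradient of 0.5 * (y - F)^2
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)
    tree.fit(X, residual)                  # each tree models the remaining error
    prediction += learning_rate * tree.predict(X)
    mse_history.append(float(np.mean((y - prediction) ** 2)))

print(mse_history[0], mse_history[-1])
```

Each stage only nudges the prediction by a fraction of the fitted residual, so the training error falls gradually rather than in a single step.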

Extreme gradient boosting

Extreme Gradient Boosting (XGBoost), proposed by Chen and Guestrin in 2016, provides a fast and efficient implementation of the Gradient Boosting algorithm for machine learning (Chen and Guestrin 2016). XGBoost is a tree-based ensemble method in which each weak learner is a decision tree built to correct the errors made by earlier trees, and the final model is a weighted combination of the trees. During training, the split points of each tree are determined by a gain-based greedy algorithm. A second-order Taylor expansion is used when optimizing the objective function, and a regularization term is added to penalize tree complexity and reduce overfitting. The XGBR method is well suited to large-scale regression problems because of its speed, efficiency, and scalability.
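
For reference, the second-order approximation of the objective at boosting round t can be written as in Chen and Guestrin (2016). The symbols below follow that paper rather than this study and are introduced here only for illustration: $g_i$ and $h_i$ are the first and second derivatives of the loss with respect to the previous prediction for sample i, $f_t$ is the tree added at round t, T is its number of leaves, w is its vector of leaf weights, and $\gamma$ and $\lambda$ are regularization parameters.

$$\tilde{\mathcal{L}}^{(t)} = \sum_{i=1}^{n} \left[ g_{i} f_{t}(x_{i}) + \frac{1}{2} h_{i} f_{t}^{2}(x_{i}) \right] + \Omega(f_{t}), \qquad \Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^{2}$$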

Model evaluation

The model should be tested on the test set after it has been trained to see how well it performs and how generalizable it is. Model evaluation metrics can be used for this. In this study, the root-mean-squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²), which are the most used metrics for regression analysis, were used. The equations for these metrics are given in Eqs. (4)–(6), respectively (Hajabdollahi Ouderji et al. 2023). In Eqs. (4)–(6), Y is the target value, $\widehat{Y}$ is the predicted value, $\overline{Y }$ is the mean of the target value, and n is the number of samples.

$${\text{RMSE}} = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {Y_{i} - \hat{Y}_{i} } \right)^{2} }$$

(4)

$${\text{MAE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {Y_{i} - \hat{Y}_{i} } \right|$$

(5)

$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {Y_{i} - \hat{Y}_{i} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {Y_{i} - \overline{Y} } \right)^{2} }}$$

(6)
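
The three metrics in Eqs. (4)–(6) can be computed directly from their definitions. The sketch below uses NumPy and a made-up set of four target and predicted values; the function name is illustrative.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) as defined in Eqs. (4)-(6)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    mae = float(np.mean(np.abs(residual)))
    ss_res = float(np.sum(residual ** 2))                    # numerator of Eq. (6)
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))    # denominator of Eq. (6)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Made-up targets and predictions.
rmse, mae, r2 = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(round(rmse, 4), mae, round(r2, 4))
```

RMSE penalizes large errors more heavily than MAE because the residuals are squared before averaging, which is why the two are usually reported together.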

The effectiveness of machine learning algorithms is also assessed using statistical tests, which indicate whether the differences between the predictions and the actual values are statistically significant. The Wilcoxon rank-sum test is non-parametric: unlike parametric tests, it does not assume that the samples follow a particular distribution (Dao 2022). Here, the Wilcoxon rank-sum test is used to check for a difference between the model output values and the actual values. The null hypothesis is that there is no significant difference between the model predictions and the target values. If the p-value is less than 0.05, the difference is considered significant and the null hypothesis is rejected; otherwise, the null hypothesis is accepted.
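
The decision rule above can be sketched with `scipy.stats.ranksums`. The text does not name the implementation that was used, so the choice of SciPy, the helper name, and the synthetic samples are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import ranksums


def rank_sum_decision(actual, predicted, alpha=0.05):
    """Return ("+" or "-", p): "+" when the null hypothesis is not rejected."""
    _, p_value = ranksums(actual, predicted)
    return ("+" if p_value >= alpha else "-"), float(p_value)


rng = np.random.default_rng(42)
actual = rng.normal(1000.0, 200.0, size=162)          # placeholder "actual" values
close = actual + rng.normal(0.0, 20.0, size=162)      # predictions near the targets
shifted = actual + 500.0                              # systematically biased predictions

h_close, p_close = rank_sum_decision(actual, close)
h_shifted, p_shifted = rank_sum_decision(actual, shifted)
print(h_close, p_close)
print(h_shifted, p_shifted)
```

A "+" outcome means only that the test found no evidence of a difference between the two samples at the chosen level; it complements, rather than replaces, the error metrics.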

Analysis of agricultural biomass potential in Türkiye

In this study, the agricultural biomass potential of each province in Türkiye for the years 2010–2021 was modeled using machine learning algorithms. First, a tenfold cross-validation analysis was performed on the dataset for 2010–2020, and then the prediction for 2021 was made. The machine learning algorithms were trained both with all features (the type of biomass, year, agricultural area, temperature, maximum temperature, minimum temperature, humidity, wind speed, precipitation, soil temperature at 5 cm, and sunshine duration) and with only the four features (the type of biomass, agricultural area, temperature, and humidity) determined during the feature selection stage. The machine learning algorithms and data analysis were implemented in the Python programming language (Python 3.8). The Scikit-learn library was preferred among machine learning libraries because it offers a rich environment and is widely used in the data science community. Different values were tried for two hyperparameters: the minimum number of samples per leaf for Random Forest and the number of neighbors for KNN. All other parameters were left at the default settings of the relevant libraries.

Analysis of agricultural biomass potential in Türkiye for the years 2010–2020

The dataset created for the years 2010–2020 includes a total of 1782 samples in vegetable and grain biomass types. Each sample contains the amount of agricultural biomass potential as a target value against the climate data of a city in the relevant year. Tenfold cross-validation was used to validate the modeling success of machine learning algorithms on the dataset, and the performance scores obtained are shown in Table 4.

Table 4

Modeling of agricultural biomass potential for 2010–2020

Number of features	Algorithm	Param	Train RMSE	Train MAE	Train R²	Test RMSE	Test MAE	Test R²	Wilcoxon rank-sum h	Wilcoxon rank-sum p
11	Random Forest	leaf = 1	444,547	192,974	0.9903	1,175,017	519,010	0.9328	+	0.064
		leaf = 3	690,184	286,129	0.9767	1,273,628	546,327	0.9210	–	0.043
	KNN	k = 2	705,126	307,614	0.9757	1,287,123	602,813	0.9193	+	0.053
		k = 5	1,085,099	524,093	0.9423	1,396,914	679,348	0.9050	–	0.000
	Gradient Boosting	Default	838,456	509,298	0.9657	1,405,078	738,112	0.9038	–	0.004
	XGBR	Default	67,471	44,555	0.9998	1,270,879	575,100	0.9213	+	0.288
4	Random Forest	leaf = 1	467,307	204,352	0.9893	1,273,104	552,269	0.9211	+	0.057
		leaf = 3	797,415	344,105	0.9689	1,342,716	584,232	0.9122	–	0.043
	KNN	k = 2	725,775	323,909	0.9742	1,314,307	616,668	0.9159	+	0.455
		k = 5	998,293	486,228	0.9513	1,250,935	611,301	0.9238	–	0.028
	Gradient Boosting	Default	946,409	554,566	0.9563	1,482,701	758,590	0.8929	–	0.009
	XGBR	Default	163,441	100,465	0.9987	1,461,690	654,355	0.8959	+	0.304

In Table 4, "Param" represents the parameters, leaf is the minimum number of samples per leaf, k is the number of neighbors, h is the statistical test result ("+": accept and "−": reject), p is the p-value of the test (the probability of obtaining a result at least as extreme as the observed one if the null hypothesis is true), bold values represent the best performance results for the Test set for each number of features, and "Default" represents the default parameters of the Scikit-learn library.

As shown in Table 4, the best performance values are obtained with Random Forest (leaf = 1) using all the features. The Wilcoxon rank-sum test shows no significant difference between the results of this model and the actual values at the significance level of p = 0.05. When only four features are used, the Random Forest (leaf = 1) and KNN (k = 5) methods give the best results, with R² values of 0.9211 and 0.9238, respectively. However, for the KNN (k = 5) method, the null hypothesis that there is no significant difference between the results of the model and the actual values is rejected at the significance level of p = 0.05, which indicates a significant difference between the model results and the actual values. Therefore, it is concluded that the Random Forest algorithm is more successful and reliable for the cross-validation analysis. The comparison of the predictions of the Random Forest algorithm (leaf = 1), which gave the most successful result in the tenfold cross-validation analysis, with the theoretical calculations is shown in Fig. 4.

In Fig. 4, the sample represents the record of each province for each year between 2010 and 2020, while the value represents the agricultural biomass potential of the relevant record. Random Forest predictions are consistent with the original curve, and the results are similar in the box plot given as an inset plot.

Analysis of agricultural biomass potential in Türkiye for the year 2021

The dataset created for the year 2021 includes a total of 162 samples in vegetable and grain biomass types. The target value for each sample is the agricultural biomass potential associated with a city's climate data for the year 2021. Validation for this dataset is calculated by considering the data predictions for 2021 by the machine learning algorithms trained with the previous years. The performance scores obtained for the 2021 predictions of the algorithms are shown in Table 5. Bold values represent the best performance results for the Test set for each number of features in Table 5.
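
The year-based hold-out split described above can be sketched with a boolean mask. The sample counts follow from the 81 provinces, two biomass types, and 12 years given in the text; the array layout and variable names are illustrative assumptions.

```python
import numpy as np

years = np.arange(2010, 2022)              # 2010..2021
n_provinces, n_types = 81, 2               # counts given in the text

# One row per (year, province, biomass type); only the year column is needed here.
year_col = np.repeat(years, n_provinces * n_types)

train_mask = year_col <= 2020              # 2010-2020: training data
test_mask = year_col == 2021               # 2021: held-out test data

n_train = int(train_mask.sum())
n_test = int(test_mask.sum())
print(n_train, n_test)
```

Splitting by year rather than at random keeps all 2021 observations out of training, so the test measures how well the models extrapolate to a year they have not seen.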

Table 5

Modeling of agricultural biomass potential for 2021

Number of features	Algorithm	Param	Train RMSE	Train MAE	Train R²	Test RMSE	Test MAE	Test R²	Wilcoxon rank-sum h	Wilcoxon rank-sum p
11	Random Forest	leaf = 1	405,619	182,453	0.9920	1,497,720	702,730	0.9206	+	0.320
		leaf = 3	652,995	271,316	0.9792	1,665,375	751,171	0.9019	+	0.341
	KNN	k = 2	695,129	303,128	0.9765	1,329,730	563,796	0.9374	+	0.284
		k = 5	1,028,640	504,380	0.9485	1,545,439	765,054	0.9155	+	0.066
	Gradient Boosting	Default	839,443	503,528	0.9657	1,498,785	821,458	0.9205	+	0.323
	XGBR	Default	75,674	49,196	0.9997	1,194,713	637,291	0.9495	+	0.510
4	Random Forest	leaf = 1	461,767	199,767	0.9896	1,265,119	631,247	0.9434	+	0.573
		leaf = 3	774,460	336,041	0.9708	1,519,927	709,140	0.9182	+	0.540
	KNN	k = 2	723,121	319,897	0.9745	1,115,222	585,629	0.9560	+	0.769
		k = 5	987,096	483,010	0.9525	1,270,772	663,559	0.9429	+	0.421
	Gradient Boosting	Default	967,671	558,748	0.9544	1,776,675	881,727	0.8883	+	0.392
	XGBR	Default	162,332	100,506	0.9987	1,492,456	718,437	0.9212	+	0.885

The most successful result was obtained as a 0.9560 R² value with the KNN algorithm (k = 2) when four features were used. Feature selection has increased the success in making future predictions and enabled the making of these predictions with fewer features. In addition, the XGBR method obtained a 0.9495 R² value when all features were used, while Random Forest (leaf = 1) obtained a 0.9434 R² value when four features were used. When Wilcoxon rank-sum test results are examined, it is seen that there was no significant difference at the significance level of p = 0.05 between the results of all methods and the actual values when both all features and four features were used. The comparison of the predictions of the KNN algorithm (k = 2 and feature = 4), which gave the most successful result in the year 2021 analysis, with the theoretical calculations is shown in Fig. 5.

As shown in Fig. 5, the KNN predictions are consistent with the original curve, and residual errors appear relatively low except for a few samples. The box plots of the model's results and the actual values, given as an inset plot, are also similar. Although the KNN method appears more successful for the prediction of 2021, Random Forest (leaf = 1) is generally more stable and successful when both the cross-validation analysis and the predictions for 2021 are considered. The comparison of the predictions of the Random Forest algorithm (leaf = 1 and feature = 4), which gave the most successful result on average across the 2021 prediction and the cross-validation analysis, with the theoretical calculations is shown in Fig. 6.

As shown in Fig. 6, Random Forest predictions are consistent with the original curve, and the results are similar in the box plot given as an inset plot.

Discussion

Regarding the results obtained, the original data and KNN prediction values for 2021 were obtained as a total of 308,888 TJ and 327,122 TJ for field products and a total of 77,002 TJ and 74,395 TJ for garden products, respectively.

For field products, the minimum value of the original data for 2021 was 29 TJ, and the KNN estimate was 39 TJ, while the maximum value was 39,223 TJ in the original data and 39,363 TJ in the prediction.

For garden products, the minimum value of the original data for 2021 was 0.8 TJ, and the KNN estimate was 0.5 TJ, while the maximum value was 8079 TJ in the original data and 6908 TJ in the prediction.

In addition, regarding average values, the original values for 2021 field products are 3813 TJ, the KNN prediction is 4038 TJ, and the margin of error is 5.9%. The average original value for garden products was 950 TJ, the KNN prediction was 918 TJ, and the margin of error was 3.3%. Accordingly, it can be seen that the difference between the predicted values and the original values is slight.

In a similar study, Avcıoğlu et al. (2019) calculated the agricultural biomass potential of Türkiye for 2015 as 298,955 TJ for total field crops and 65,491 TJ for horticultural crops (Avcıoğlu et al. 2019). Compared with these 2015 values, the original 2021 data calculated in this study are 3.3% higher for field crops and 17.5% higher for garden crops. Senocak and Guner Goren (2022) used 16 years of data from agricultural and animal biomass resources for a small selected region in Türkiye and estimated the amount of energy using SVR-GIS. The theoretical total energy in the 3 years was estimated as 27,088.095, 27,165.993, and 26,862.373 TEP/year, respectively (Senocak and Guner 2022). Tumen Ozdil and Caliskan (2022) calculated a theoretical energy potential of 2,825,932 × 10¹² J from field crops and 752,031 × 10¹² J from garden plants, using the biomass potential in Türkiye between 2008 and 2018 (Tumen Ozdil and Caliskan 2022).

Conclusions

Türkiye is rich in agricultural diversity owing to its geographical location and climatic conditions. This diversity produces crop residues from fields and gardens, which can be a significant energy source. In Türkiye, agricultural residues are generally used as animal feed or incinerated directly for basic heating. The agricultural sector in Türkiye has developed over the years through modern agricultural methods, and its importance has increased in recent years as the surrounding countries are at war. Many support mechanisms have been established for farmers in Türkiye to develop the sector, including seed, fuel, fertilizer, energy, tractor, agricultural credit, and irrigation support. With the sector's development, the amount of agricultural products has increased every year, and so has the amount of product residue, which is a source of biomass. Agricultural residue management is critical in biomass energy production and use because residues are an alternative and environmentally friendly energy source. This study calculated the energy production potential obtained from agricultural products for Türkiye between 2010 and 2021. The total theoretical energy potential obtained from field and garden products was calculated as 222,620 TJ and 61,737 TJ for 2010 and as 308,888 TJ and 77,002 TJ for 2021, respectively.

Using the SGD method, this study identified agricultural area, temperature, biomass type, and humidity as the most informative features. Different models and hyperparameters were tested with cross-validation analysis for 2010–2020, using both all features and the four selected features, and the best model for agricultural biomass potential was determined. In addition, the agricultural biomass potential for 2021 was predicted with machine learning algorithms trained on the 11 years of environmental and climate data covering 2010–2020. When the results of the tenfold cross-validation and the predictions for 2021 were examined, it was concluded that the Random Forest algorithm (leaf = 1) was successful and stable. Using four features (agricultural area, temperature, biomass type, and humidity), the Random Forest algorithm (leaf = 1) obtained R² values of 0.9211 and 0.9434 for the tenfold cross-validation and the 2021 prediction, respectively. With the same four features, the KNN algorithm (k = 2) achieved the highest R² value, 0.9560, but only in the prediction of 2021. When all features are used, the Random Forest algorithm (leaf = 1) reached an R² of 0.9328 in the tenfold cross-validation analysis, while the XGBR method reached an R² of 0.9495 in the 2021 predictions.

These results show that machine learning methods can be used to model agricultural biomass potential. Using environmental and climatic values instead of the many variables from many categories required for the theoretical calculation makes the calculation much easier. In addition, feature selection reduced these variables to four, which simplifies the calculation further while maintaining high estimation accuracy.

With the proposed model, web and mobile applications that can estimate biomass potential can be created using agricultural field information and meteorological data published online. In this way, decision mechanisms can be developed regarding industrial production, updating the installed capacity of the biomass power plants in Türkiye and which regions they should focus on for agricultural waste to be collected. Such applications can also contribute to Türkiye's renewable energy policies and achieve its zero-emission target.

The model developed in this study was created only according to Türkiye's climatic parameters and estimated the agricultural biomass potential. The model can be recreated with data and parameters for another country or other biomass resources.

This study offers a detailed perspective on improving energy management practices in agriculture. Additionally, the use of machine learning algorithms is expected to contribute to the literature and to support institutions and researchers that promote sustainable development in the sector. The efficient use of renewable energy resources in agriculture and artificial intelligence-supported energy efficiency management practices can help achieve sustainable agriculture goals.

Acknowledgements

The authors would like to thank the Turkish State Meteorological Service for providing meteorological data.

Declarations

Competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Asghar N, Majeed MT, Khan MW, Anwar A (2022) Biomass energy consumption and sustainable development: empirical evidence from Asian economies. Environ Sci Pollut Res 1:1–16. https://doi.org/10.1007/s11356-022-22132-7CrossRef

Avcıoğlu AO, Dayıoğlu MA, Türker U (2019) Assessment of the energy potential of agricultural biomass residues in Turkey. Renew Energy 138:610–619. https://doi.org/10.1016/j.renene.2019.01.053CrossRef

Aydin M (2019) The effect of biomass energy consumption on economic growth in BRICS countries: a country-specific panel data analysis. Renew Energy 138:620–627. https://doi.org/10.1016/J.RENENE.2019.02.001CrossRef

Bakay MS, Ağbulut Ü (2021) Electricity production based forecasting of greenhouse gas emissions in Turkey with deep learning, support vector machine and artificial neural network algorithms. J Clean Prod 285:125324. https://doi.org/10.1016/j.jclepro.2020.125324CrossRef

Balat M (2005) Use of biomass sources for energy in Turkey and a view to biomass potential. Biomass Bioenerg 29:32–41. https://doi.org/10.1016/J.BIOMBIOE.2005.02.004CrossRef

Balsalobre-Lorente D, Driha OM, Bekun FV, Osundina OA (2019) Do agricultural activities induce carbon emissions? The BRICS experience. Environ Sci Pollut Res 26:25218–25234. https://doi.org/10.1007/S11356-019-05737-3/TABLES/7CrossRef

Barbera AC, Vymazal J, Maucieri C (2019) Greenhouse gases formation and emission. Encycl Ecol. https://doi.org/10.1016/B978-0-12-409548-9.10895-4CrossRef

Bilandzija N, Voca N, Jelcic B, Jurisic V, Matin A, Grubor M et al (2018) Evaluation of Croatian agricultural solid biomass energy potential. Renew Sustain Energy Rev 93:225–230. https://doi.org/10.1016/J.RSER.2018.05.040CrossRef

Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324/METRICSCrossRef

Can A (2022) Investigation of provincial capacity to produce biogas from waste disposal sites in Turkey. Energy 258:124778. https://doi.org/10.1016/J.ENERGY.2022.124778CrossRef

Cekinir S, Ozgener O, Ozgener L (2022) Türkiye’s energy projection for 2050. Renew Energy Focus 43:93–116. https://doi.org/10.1016/J.REF.2022.09.003CrossRef

Chang MC, Hu JL (2019) A long-term meta-frontier analysis of energy and emission efficiencies between G7 and BRICS. Energy Effic 12:879–893. https://doi.org/10.1007/S12053-018-9696-7/FIGURES/4CrossRef

Channi HK, Singh M, Brar YS, Dhingra A, Gupta S, Singh H et al (2022) Agricultural waste assessment for the optimal power generation in the Ludhiana district, Punjab, India. Mater Today Proc 50:700–708. https://doi.org/10.1016/J.MATPR.2021.04.481CrossRef

Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the ACM SIGKDD international conference on Knowledge discovery and data mining, Association for Computing Machinery, pp 785–94. https://doi.org/10.1145/2939672.2939785.

D’Adamo I, Falcone PM, Ferella F (2019) A socio-economic analysis of biomethane in the transport sector: the case of Italy. Waste Manag 95:102–115. https://doi.org/10.1016/J.WASMAN.2019.06.005CrossRef

Dao PB (2022) On Wilcoxon rank sum test for condition monitoring and fault detection of wind turbines. Appl Energy 318:119209. https://doi.org/10.1016/J.APENERGY.2022.119209CrossRef

Erat S, Telli A, Ozkendir OM, Demir B (2021) Turkey’s energy transition from fossil-based to renewable up to 2030: milestones, challenges and opportunities. Clean Technol Environ Policy 23:401–412. https://doi.org/10.1007/s10098-020-01949-1CrossRef

Filipović S, Lior N, Radovanović M (2022) The green deal – just transition and sustainable development goals Nexus. Renew Sustain Energy Rev 168:112759. https://doi.org/10.1016/J.RSER.2022.112759CrossRef

Friedman JH (2001) Greedy function approximation : a gradient boosting machine. Ann Stat 29:1189–1232CrossRef

Géron A (2019) Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: concepts, tools, and techniques to build intelligent systems. O’Reilly Media, Sebastopol

Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, London

Guler D, Buttenfield BP, Charisoulis G, Yomralioglu T (2022) Comparative analysis of bioenergy potential and suitability modeling in the USA and Turkey. Sustain Energy Technol Assess 53:102626. https://doi.org/10.1016/J.SETA.2022.102626CrossRef

Güney T, Kantar K (2020) Biomass energy consumption and sustainable development. Int J Sustain Dev World Ecol 27:762–767. https://doi.org/10.1080/13504509.2020.1753124/SUPPL_FILE/TSDW_A_1753124_SM4857.RARCrossRef

Hajabdollahi Ouderji Z, Gupta R, Mckeown A, Yu Z, Smith C, Sloan W et al (2023) Integration of anaerobic digestion with heat Pump: Machine learning-based technical and environmental assessment. Bioresour Technol 369:128485. https://doi.org/10.1016/J.BIORTECH.2022.128485CrossRef

Hatunoğlu EE (2010) The Impacts of biofuel policies on Agriculture Sector. İktisadi Sektörler ve Koordinasyon Genel Müdürlüğü, Planning Expertise Thesis 2010; Ankara

Hiloidhari M, Baruah DC (2011) Crop residue biomass for decentralized electrical power generation in rural areas (part 1): investigation of spatial availability. Renew Sustain Energy Rev 15:1885–1892. https://doi.org/10.1016/J.RSER.2010.12.010CrossRef

Jayarathna L, Kent G, O’Hara I, Hobson P (2020) A geographical information system based framework to identify optimal location and size of biomass energy plants using single or multiple biomass types. Appl Energy 275:115398. https://doi.org/10.1016/J.APENERGY.2020.115398CrossRef

Jayarathna L, Kent G, O’Hara I, Hobson P (2022) Geographical information system based fuzzy multi criteria analysis for sustainability assessment of biomass energy plant siting: a case study in Queensland, Australia. Land Use Policy 114:105986. https://doi.org/10.1016/J.LANDUSEPOL.2022.105986CrossRef

Kaygusuz K (2010) Sustainable energy, environmental and agricultural policies in Turkey. Energy Convers Manag 51:1075–1084. https://doi.org/10.1016/J.ENCONMAN.2009.12.012CrossRef

Khanlari A, Sözen A, Şirin C, Tuncer AD, Gungor A (2020) Performance enhancement of a greenhouse dryer: analysis of a cost-effective alternative solar air heater. J Clean Prod 251:119672. https://doi.org/10.1016/J.JCLEPRO.2019.119672CrossRef

Knápek J, Králík T, Vávrová K, Weger J (2020) Dynamic biomass potential from agricultural land. Renew Sustain Energy Rev 134:110319. https://doi.org/10.1016/J.RSER.2020.110319CrossRef

Kumaş K, Akyüz AÖ, Temiz D, Güngör A (2019) Biomass to energy: the potential of biogas in Turkey and World. J Voc Sci 8:70–77

Ma J, Oppong A, Acheampong KN, Abruquah LA (2018) Forecasting renewable energy consumption under zero assumptions. Sustain 10:576. https://doi.org/10.3390/SU10030576CrossRef

Melikoglu M, Menekse ZK (2020) Forecasting Turkey’s cattle and sheep manure based biomethane potentials till 2026. Biomass Bioenerg 132:105440. https://doi.org/10.1016/j.biombioe.2019.105440CrossRef

Moustakas K, Parmaxidou P, Vakalis S (2020) Anaerobic digestion for energy production from agricultural biomass waste in Greece: capacity assessment for the region of Thessaly. Energy 191:116556. https://doi.org/10.1016/J.ENERGY.2019.116556CrossRef

Ocak S, Acar S (2021) Biofuels from wastes in Marmara Region, Turkey: potentials and constraints. Environ Sci Pollut Res 28:66026–66042. https://doi.org/10.1007/s11356-021-15464-3CrossRef

Ozturk M, Saba N, Altay V, Iqbal R, Hakeem KR, Jawaid M et al (2017) Biomass and bioenergy: an overview of the development potential in Turkey and Malaysia. Renew Sustain Energy Rev 79:1285–1302. https://doi.org/10.1016/J.RSER.2017.05.111CrossRef

Pence I, Kumaş K, Siseci MC, Akyüz A (2023) Modeling of energy and emissions from animal manure using machine learning methods: the case of the Western Mediterranean Region, Turkey. Environ Sci Pollut Res 30:22631–22652. https://doi.org/10.1007/s11356-022-23780-5CrossRef

Rincon L, Puri M, Kojakovic A, Maltsoglou I (2019) The contribution of sustainable bioenergy to renewable electricity generation in Turkey: evidence based policy from an integrated energy and agriculture approach. Energy Policy 130:69–88. https://doi.org/10.1016/j.enpol.2019.03.024CrossRef

Riva G, Foppapedretti E, Carolis C. Handbook on Renewable Energy Sources-Biomass. Ener Supply; 2014.

RTMAF. Republic of Turkey Ministry of Agriculture And Forestry 2022. https://hbs.tarbil.gov.tr/ (accessed February 6, 2022).

RTME&NR (2024) Republic of Turkey Ministry of Energy and Natural Resources. https://www.enerji.gov.tr/. Accessed 20 Feb 2024.

Samadi SH, Ghobadian B, Nosrati M (2020) Prediction and estimation of biomass energy from agricultural residues using air gasification technology in Iran. Renew Energy 149:1077–1091. https://doi.org/10.1016/J.RENENE.2019.10.109CrossRef

Sayin C, NisaMencet M, Ozkan B (2005) Assessing of energy policies based on Turkish agriculture: current status and some implications. Energy Policy 33:2361–2373. https://doi.org/10.1016/J.ENPOL.2004.05.005CrossRef

Senocak AA, Guner GH (2022) Forecasting the biomass-based energy potential using artificial intelligence and geographic information systems: a case study. Eng Sci Technol an Int J 26:100992. https://doi.org/10.1016/j.jestch.2021.04.011CrossRef

Şenol H, Ali Dereli M, Özbilgin F (2021) Investigation of the distribution of bovine manure-based biomethane potential using an artificial neural network in Turkey to. Renew Sustain Energy Rev 149:111338. https://doi.org/10.1016/j.rser.2021.111338CrossRef

Singh J (2015) Overview of electric power potential of surplus agricultural biomass from economic, social, environmental and technical perspective—A case study of Punjab. Renew Sustain Energy Rev 42:286–297. https://doi.org/10.1016/J.RSER.2014.10.015CrossRef

Singh J (2016) A roadmap for production of sustainable, consistent and reliable electric power from agricultural biomass—an Indian perspective. Energy Policy 92:246–254. https://doi.org/10.1016/J.ENPOL.2016.02.013CrossRef

Tanasa C, Dan D, Becchio C, Corgnati SP, Stoian V (2020) Cost-optimal and indoor environmental quality assessment for residential buildings towards EU long-term climate targets. Energy Sustain Dev 59:49–61. https://doi.org/10.1016/J.ESD.2020.09.002CrossRef

Telli A, Erat S, Demir B (2021) Comparison of energy transition of Turkey and Germany: energy policy, strengths/weaknesses and targets. Clean Technol Environ Policy 23:413–427. https://doi.org/10.1007/S10098-020-01950-8/FIGURES/4CrossRef

TETC (2024) Turkish Electricity Transmission Company. https://www.teias.gov.tr/. Accessed 20 Feb 2024.

Toklu E (2017) Biomass energy potential and utilization in Turkey. Renew Energy 107:235–244. https://doi.org/10.1016/J.RENENE.2017.02.008CrossRef

TSMS. Turkish State Meteorological Service 2022. https://mgm.gov.tr/eng/forecast-cities.aspx (accessed November 18, 2022).

TUIK. Turkish Statistical Institute 2022. https://data.tuik.gov.tr/Kategori/GetKategori?p=Nufus-ve-Demografi-109 (accessed January 5, 2022).

Tumen Ozdil NF, Caliskan M (2022) Energy potential from biomass from agricultural crops: development prospects of the Turkish bioeconomy. Energy 249:123770. https://doi.org/10.1016/J.ENERGY.2022.123770CrossRef

Yurtkuran S (2021) The effect of agriculture, renewable energy production, and globalization on CO₂ emissions in Turkey: a bootstrap ARDL approach. Renew Energy 171:1236–1245. https://doi.org/10.1016/j.renene.2021.03.009CrossRef

Zafar MW, Sinha A, Ahmed Z, Qin Q, Zaidi SAH (2021) Effects of biomass energy consumption on environmental quality: the role of education and technology in Asia-Pacific Economic Cooperation countries. Renew Sustain Energy Rev 142:110868. https://doi.org/10.1016/J.RSER.2021.110868CrossRef

Zheng Y, Qiu F (2020) Bioenergy in the Canadian Prairies: assessment of accessible biomass from agricultural crop residues and identification of potential biorefinery sites. Biomass Bioenerg 140:105669. https://doi.org/10.1016/J.BIOMBIOE.2020.105669CrossRef

Zheng X, Streimikiene D, Balezentis T, Mardani A, Cavallaro F, Liao H (2019) A review of greenhouse gas emission profiles, dynamics, and climate change mitigation efforts across the key climate change players. J Clean Prod 234:1113–1133. https://doi.org/10.1016/J.JCLEPRO.2019.06.140CrossRef

Title: Detailed analysis of Türkiye's agricultural biomass-based energy potential with machine learning algorithms based on environmental and climatic conditions
Authors: I. Pence, K. Kumas, M. Siseci Cesmeli, A. Akyüz
Publication date: 21 March 2024
Publisher: Springer Berlin Heidelberg
Published in: Clean Technologies and Environmental Policy
Print ISSN: 1618-954X
Electronic ISSN: 1618-9558
DOI: https://doi.org/10.1007/s10098-024-02822-1