Scientific Reports volume 15, Article number: 6544 (2025)
The stochastic and variable nature of power generated by photovoltaic (PV) systems can impact grid stability, so accurately predicting the output power of a solar PV generation system is crucial. While short-term PV power prediction is highly accurate, medium- to long-term prediction remains far more challenging. To improve the accuracy of medium- and long-term photovoltaic power prediction, a hybrid deep learning model named the interactive feature trend transformer (IFTformer) is designed. First, the deep isolation forest (DIF) and local anomaly factor (LOF) methods are combined in a parallel framework that serves as the data preprocessing module, removing outliers from the raw data. The time series is then decomposed into seasonal and trend components, which are modelled separately. Finally, the predicted trend components and the seasonal components predicted by a ProbSparse self-attention mechanism with information interaction are fitted by a multilayer perceptron (MLP) for medium- to long-term PV power prediction. Comprehensive experimental results show that the predictive performance of the IFTformer is superior to that of the baseline models, with a normalised root mean square error (NRMSE) of 3.64% and a normalised mean absolute error (NMAE) of 2.44%. The proposed IFTformer is an effective approach for medium- to long-term PV power prediction: it mitigates the impact of outliers, strengthens feature extraction, and improves the accuracy, generalizability and robustness of medium- to long-term predictions, providing a novel perspective on deep-learning-based medium- to long-term PV power prediction methods.
Given the detrimental environmental impact of fossil fuels and the increasing energy demand, renewable energy production plays a pivotal role in achieving sustainable development because it is a clean, low-carbon and cost-effective process1. Over the past decade, the global installed capacity of photovoltaic power generation (PPG) has grown rapidly, accounting for 36.6% of the increase in renewable energy installed capacity2. However, photovoltaic systems are affected by factors such as solar irradiance, temperature, location, and dust coverage3,4, which give PPG nonlinear, stochastic, and fluctuating characteristics and pose severe challenges to the operation of the power grid. In particular, an imbalance between supply and demand affects the safe and stable operation of the grid5, thereby restricting the promotion and expansion of PPG6,7. Accurate prediction of PV power can effectively alleviate these problems and facilitate grid scheduling, thus improving the photovoltaic power consumption and economic benefits of power plants. Research on day-ahead and intra-day short-term photovoltaic power forecasting has flourished recently and made significant progress8. However, short-term forecasting cannot support the preparation of medium- to long-term generation plans at time scales such as weekly, 10-day, and monthly horizons, and accurate medium- to long-term photovoltaic prediction remains under-researched. Medium- to long-term prediction of photovoltaic power is therefore necessary to help power companies develop appropriate plans earlier and avoid losses9, and it is also of great significance for energy conservation, emission reduction, and sustainable development.
Numerous time series models for PV power prediction have been proposed by previous authors. In this paper, the prediction horizon for photovoltaic power generation is divided into three categories: short-term prediction usually covers PPG within 1 day, medium-term prediction spans 2 days to 1 week, and long-term prediction refers to forecasts of more than 2 weeks. The prediction process can be broadly divided into two stages: data cleaning and model prediction10. Data cleaning is an essential preliminary step: identifying and removing anomalous data increases prediction accuracy11,12. An approximately linear relationship exists between the output power of a photovoltaic panel and the irradiation it receives13, and this distribution characteristic can serve as the basis for identifying anomalous data. The main data cleaning methods include the random forest method14, the copula algorithm15, clustering techniques16, the local anomaly factor (LOF) method17, the two-step quantile algorithm18, the DBSCAN algorithm19, and the sliding standard deviation algorithm20. These scholars have made notable contributions to anomalous data processing. However, their work has focused primarily on identifying specific types of anomalous data via a single identification method, which may be limited in its ability to identify anomalous data that do not conform to a particular type.
The prevailing methods for predicting the output of PV power plants can be divided into three kinds: physical models, statistical techniques and hybrid approaches18. Physical modelling methods employ formulas for calculating PV power based on the physical characteristics and operating laws of PV generation systems21,22. However, the specific and empirical parameters of complex PV modules are difficult to determine, limiting the accuracy of physical models23. Furthermore, the field of PV power prediction has largely embraced intelligent statistical methods, with prominent examples including the auto-regressive exogenous method24, grey prediction25, ANN26, SVM27, ELM models28, and MLP models29. These methods frequently exhibit prediction data drift. More recently, deep learning networks have been proposed for prediction and modelling, including CNNs30, RNNs31, LSTM networks32, AEs33, and other hybrid models34. These networks can memorise historical data, using their memory structures to preserve past time series information, which makes them more effective than traditional prediction methods. For predicting PV power output, Transformers and their associated models have gained prominence in recent years35. These methods incorporate attention mechanisms to maintain backwards and forwards temporal dependencies, enabling more powerful predictive models. Jimin Kim et al. employed a hybrid LSTM and Transformer model for multistep prediction of day-ahead PV power, investigating the optimal methodology for processing and inputting weather observation and forecast data to increase prediction efficiency36. Xinyu Wang et al. employed a Convolutional Transformer structure to predict each subsequence of the PV power dependent variable, achieving a low error rate for PV power systems across different seasons37. Ze Wu et al. improved predictive modelling based on the Informer model38,39,40,41; although some indicators improved, the modelling of fluctuations caused by sudden and rapid changes in the data is not ideal. Hang Zhao et al. improved the PatchTST model and established a highly efficient computational model, but it still cannot smoothly model data with a high proportion of outliers42,43,44.
The above literature has achieved good prediction results for photovoltaic power, but these studies all address short-term prediction of PPG time series. Medium- to long-term photovoltaic power generation is strongly affected by meteorological factors: the long-term power sequence exhibits nonlinear characteristics, and its regularity is less obvious than that of short-term sequences. Modelling results may therefore be affected by random fluctuations superimposed on periodic behaviour, resulting in significant errors45,46,47. Consequently, medium- to long-term PPG prediction is more difficult and challenging.
The structure of a medium- to long-term PPG prediction model is basically the same as that of a short-term model, in which data cleaning, sequence decomposition, temporal analysis, and an attention mechanism are all necessary. The motivation of this study is to propose a novel deep attention structure, termed the Interactive Feature Trend Transformer (IFTformer), for medium- to long-term PV power prediction with improved accuracy. The IFTformer is designed to analyse power series from diverse perspectives to improve the model's performance. The principal innovations and contributions can be summarized as follows:
A novel IFTformer model based on multi-model parallel anomaly data recognition, time series decomposition and attention interaction mechanism is proposed for medium- to long-term PV power prediction.
A novel approach for processing PV power anomaly data is proposed. The anomaly detection model is established using DIF and LOF with different hyperparameters for each photovoltaic data sample point based on a multi-model parallel integration framework. This approach can improve the quality of raw data and ensure the accuracy of photovoltaic power prediction.
The PV power time series is decomposed into trend and seasonal components. A multilayer perceptron (MLP) is employed for trend modelling, while the seasonal component is modelled via an attention mechanism. Fitting the predictions from both models together captures the deep nonlinear features of the PV power series and yields stable and accurate predictions.
The time dependence of the PV power series is modelled through a multi-head ProbSparse self-attention mechanism. The keys and values of adjacent time steps are exchanged to carry out information interaction and enhance attention to trend changes. Thus, the IFTformer can further improve prediction accuracy for medium- to long-term PV power while reducing computational costs.
Given the inherent mechanistic model of PV panels and the feature selection of PV power data, we have designed the IFTformer for medium- to long-term time series forecasting of PV power, as illustrated in Fig. 1. The principal stages of the IFTformer model include identifying aberrant values within the raw dataset, filling any gaps in the data, and selecting pertinent variables affecting PV power. Next, the time series \(X\) is decomposed into seasonal components \(x_s\) and trend components \(x_t\) via the moving average kernel method. The weather information is subsequently embedded. The trend component is modelled via RevIN normalisation and an MLP. For the seasonal component, the input is a pair of consecutive temporal features, \(x_{st}\) and \(x_{st+1}\). Following the temporal and positional embedding of PV power influencing factors, we utilise ProbSparse self-attention for the computation, a methodology that effectively reduces the size of the network. The feature information of \(x_{st+1}\), \(x_{st}\), and their difference \(x_{st+1}-x_{st}\) is aggregated via a fusion strategy. Finally, the features are processed via an MLP and inverse normalisation to obtain the final result. The core of the IFTformer is a feature selection module and two attention blocks that mine deep features from different aspects.
Overall framework of IFTformer.
There are two main types of anomalous data in PPG: stacked and decentralised, as illustrated in Fig. 2a. For stacked PV power plant anomaly data, the DIF algorithm is frequently employed to identify and remove outliers48. The DIF algorithm (Fig. 2b) maps the data through randomly initialised neural networks; simple axis-parallel segmentation of the resulting representations then achieves nonlinear isolation, allowing anomalous data to be effectively identified. Decentralised anomalous data, which exhibit a significant density discrepancy compared with normal data, are frequently identified via the local anomaly factor (LOF) method, as illustrated in Fig. 2c. The LOF method is a density-based anomaly detection method17 that computes the ratio of the average density of the data points surrounding a point x in a PV power dataset to the density at x itself; this ratio is defined as the local anomaly factor. A ratio close to or below 1 indicates normal data, while a higher ratio suggests that the point is anomalous. A PV power data point x in different datasets, or in different locations of the same dataset, will exhibit inconsistent characteristics. The LOF method determines whether x is abnormal on the basis of its kth-distance neighbourhood, effectively identifying local anomalies in the PV power data and thus avoiding the limitations of methods based only on global anomalies.
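As a minimal sketch of this parallel detection idea, the snippet below runs a density-based LOF detector alongside an isolation-forest detector (scikit-learn's standard IsolationForest stands in for the deep isolation forest, whose network-based representations are not reproduced here) on synthetic irradiance-power pairs; the data, hyperparameters and union-based combination rule are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# Synthetic irradiance-vs-power points: a normal cluster plus three outliers.
normal = rng.normal(loc=[500.0, 50.0], scale=[30.0, 3.0], size=(200, 2))
outliers = np.array([[500.0, 5.0], [100.0, 60.0], [900.0, 0.0]])
X = np.vstack([normal, outliers])

# Isolation-forest detector (stand-in for the deep isolation forest, DIF).
iforest = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)
if_labels = iforest.fit_predict(X)          # -1 = anomaly, 1 = normal

# Density-based LOF detector for decentralised anomalies.
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.02)
lof_labels = lof.fit_predict(X)

# Parallel combination: flag a point if either detector marks it anomalous.
combined = np.where((if_labels == -1) | (lof_labels == -1), -1, 1)
```

In the full framework the two detectors are not simply unioned but arbitrated per data point, as described in the following section.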
Identification results of anomaly data.
For the different types of PV plant anomaly data, a single identification method struggles to identify and process anomalies in all respects49, while the sequential integration method, in which two recognition methods are applied in sequence, may over-delete data. This article therefore proposes a method for processing anomalous PV plant power data based on a multi-model parallel integration framework. The method incorporates heterogeneous base learning models to recognise various types of PV plant anomaly data and employs DCS to identify the anomaly detection model best suited to each PV power data sample point, on the principle of evaluating each model's performance in the local neighbourhood of a data point. The recognition results of the base learning models are then filtered and combined by generating localised pseudolabels on the basis of the TOPSIS algorithm. Finally, after the anomalous data are removed, the missing values in the dataset are filled via cubic spline interpolation. The multi-model parallel framework for data cleaning is shown in Fig. 3.
The multimodal parallel framework.
The detailed process of constructing the multimodal parallel framework is as follows:
Construct the model pool. The pool of basic anomaly detection models is built by training different models based on the DIF and LOF methods with varying hyperparameters.
Calculate the local anomaly score matrix. The k-nearest-neighbour (KNN) method is employed to delineate the local nearest-neighbour region \(\psi_i\) of each data point in the PV power dataset: \(\psi_i=\left\{x_j \mid x_j\in X_{train},\ x_j\in KNN_{ens}^{i}\right\}\), where \(KNN_{ens}^{i}\) denotes the satisfaction condition of the KNN method. For each data point \(x_i\), every point in its neighbourhood \(\psi_i\) is input into the basic anomaly detection model pool for identification, yielding a vector of local anomaly scores. These scores are aggregated to form the local anomaly score matrix \(O(\psi_i)\).
Use the TOPSIS method for selecting the optimal basic anomaly detection model.
The TOPSIS method selects the positive ideal solution of each local anomaly score matrix \(O(\psi_i)\) as the anomaly detection model for that data point: the model with the highest anomaly rating among all models is taken to best identify anomalies. The Euclidean distance between each anomaly recognition model and the ideal solution is calculated, the relative proximity is derived from this distance, and each scheme receives a score. Anomaly recognition schemes with 95% or greater similarity are selected as the optimal model group. If a single model is selected, its detection result is the identification result for \(x_i\); if more than one model is selected, the average of their detection results is used.
Reconstruct the missing values.
The cubic spline interpolation method employs a segmented polynomial function to approximate the normal data points, thereby filling in the missing values.
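A brief sketch of this gap-filling step, assuming SciPy's CubicSpline and a toy hourly power series with two missing samples; the framework's exact interpolation settings are not specified in the text.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hourly PV power with two samples removed after anomaly deletion (NaN gaps).
t = np.arange(10, dtype=float)
power = np.array([0.0, 1.2, 3.5, np.nan, 8.9, 10.1, np.nan, 7.8, 4.2, 1.1])

mask = ~np.isnan(power)
spline = CubicSpline(t[mask], power[mask])   # fit on the remaining normal points
filled = power.copy()
filled[~mask] = spline(t[~mask])             # evaluate at the missing timestamps
```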
To construct the prediction model, the input variables must first be determined. Meteorological factors such as ambient temperature and solar radiation intensity are correlated with PPG power, and the Pearson correlation coefficient R is used to quantify this relationship50. The R-value is calculated using formula (1):

\(R=\frac{\sum_{i=1}^{n}\left(x_i-\bar{x}\right)\left(y_i-\bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^{2}}\sqrt{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}}}\)   (1)
where \(x_i\) and \(y_i\) are the original values of the input and output variables, and \(\bar{x}\) and \(\bar{y}\) are their respective means. The stronger the correlation between two variables, the closer the R-value approaches +1 or −1. The relationship between the degree of correlation and the correlation coefficient is shown in Table 1.
We will ultimately select factors with moderate or above correlation as input parameters for the prediction model.
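The selection rule can be sketched as follows; the |R| ≥ 0.4 threshold for "moderate or above" and the synthetic feature series are assumptions for illustration (the actual correlation bands come from Table 1).

```python
import numpy as np

def pearson_r(x, y):
    # Pearson correlation coefficient between two 1-D series (formula (1)).
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

rng = np.random.default_rng(1)
n = 500
irradiance = rng.uniform(0, 1000, n)
power = 0.01 * irradiance + rng.normal(0, 0.5, n)   # strongly correlated
wind_dir = rng.uniform(0, 360, n)                   # essentially unrelated

candidates = {"irradiance": irradiance, "wind_direction": wind_dir}
# Keep features whose |R| with power is at least 0.4 (moderate or above).
selected = [name for name, series in candidates.items()
            if abs(pearson_r(series, power)) >= 0.4]
```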
In this study, we propose decomposing the PV power series, after feature engineering, into a trend component and a seasonal component. For the trend component, we employ an MLP; for the seasonal component, we utilise an improved probabilistic sparse (ProbSparse) self-attention mechanism. After each component is learned and predicted, the PV power predictions from both models are integrated to yield the overall prediction.
First, the seasonal component \(X_s\) and the trend component \(X_t\) are derived from the PV power series \(X\) through a moving average decomposition:

\(X_t=\mathrm{AvgPool}\left(\mathrm{Padding}(X)\right),\qquad X_s=X-X_t\)

where \(\mathrm{Padding}(X)\) ensures that the output sequence has the same length as the original PV power sequence \(X\).
Pseudo-code for decomposition method.
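A minimal NumPy version of this moving-average decomposition might look as follows; the kernel size of 25 and end-point replication padding are assumptions consistent with common practice, not the paper's stated settings.

```python
import numpy as np

def series_decomp(x, kernel_size=25):
    """Moving-average decomposition into seasonal and trend parts.

    The series is padded by repeating its end points so the trend has the
    same length as the input (the Padding(X) step in the text).
    """
    x = np.asarray(x, float)
    pad_left = (kernel_size - 1) // 2
    pad_right = kernel_size - 1 - pad_left
    padded = np.concatenate([np.repeat(x[0], pad_left), x,
                             np.repeat(x[-1], pad_right)])
    kernel = np.ones(kernel_size) / kernel_size
    trend = np.convolve(padded, kernel, mode="valid")  # moving-average trend
    seasonal = x - trend                               # remainder is seasonal
    return seasonal, trend

t = np.arange(200, dtype=float)
x = 0.05 * t + np.sin(2 * np.pi * t / 24)   # linear trend + daily cycle
seasonal, trend = series_decomp(x, kernel_size=25)
```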
To eliminate the inconsistency of feature dimensions in the trend component, which would reduce prediction accuracy, the RevIN normalisation technique is employed to increase the model's capacity to accommodate variations in the dataset distribution and to process non-stationary information. The resulting normalised data are expressed as:

\(\tilde{X}=\gamma\cdot\frac{X-\mathrm{AVG}[X]}{\sqrt{\mathrm{Var}[X]+\varepsilon}}+\beta\)
Here \(\mathrm{AVG}[X]\) and \(\mathrm{Var}[X]\) denote the mean and variance of the input time series \(X\), which are used together with the affine transformation parameters \(\beta\) and \(\gamma\).
A three-layer MLP is employed to predict the trend component in the PV power series.
Pseudo-code for trend component prediction.
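The trend branch can be sketched in PyTorch as RevIN normalisation followed by a three-layer MLP; the hidden width, activation and exact RevIN variant are assumptions, not the paper's reported configuration.

```python
import torch
import torch.nn as nn

class RevIN(nn.Module):
    """Reversible instance normalisation with learnable affine parameters."""
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))

    def forward(self, x):                    # x: (batch, length, features)
        self.mean = x.mean(dim=1, keepdim=True)
        self.std = torch.sqrt(x.var(dim=1, keepdim=True, unbiased=False) + self.eps)
        return (x - self.mean) / self.std * self.gamma + self.beta

    def inverse(self, y):                    # undo normalisation on forecasts
        return (y - self.beta) / self.gamma * self.std + self.mean

class TrendMLP(nn.Module):
    """Three-layer MLP mapping an input window to the trend forecast."""
    def __init__(self, seq_len, pred_len, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(seq_len, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, pred_len),
        )

    def forward(self, x_t):                  # x_t: (batch, seq_len, features)
        return self.net(x_t.transpose(1, 2)).transpose(1, 2)

revin = RevIN(num_features=1)
model = TrendMLP(seq_len=96, pred_len=48)
x_t = torch.randn(4, 96, 1)                 # a batch of trend windows
y = model(revin(x_t))                       # normalised forecast, (4, 48, 1)
```

A forecast in the original scale would then be obtained with `revin.inverse(y)`.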
The seasonal component is modelled via the Informer architecture. However, unlike traditional models based on the ProbSparse self-attention mechanism, our design incorporates an interaction operation on self-attention features to handle the complexity of learning at sequence mutations. This architectural choice captures the seasonal trend of PV power more accurately and enhances the model's adaptability to cyclical changes. The prediction horizons are set to [48, 96, 192, 336, 720]. The entire seasonal prediction process is illustrated in Fig. 4.
Generative decoding.
Each input to the model consists of a sequence \(X_{st}\in \mathbb{R}^{L\times d}\), where L denotes the sequence length and d the feature dimension. Pairs of features \((X_{st}, X_{st+1})\) from consecutive time steps are processed together. In particular, the information interaction module aggregates the intrinsic information of the PV power sequences \(X_{st}\) and \(X_{st+1}\), as well as their feature difference \(X_{st+1}-X_{st}\), through a fusion strategy. As the deep learning model is trained, the dimension of the input sequences is gradually reduced. The intermediate embedding sequence \(X_{token}\in \mathbb{R}^{D\times d}\) is employed to generate the input to the decoder. Finally, a loss function compares \(X_{token}\) and \(X_{output}\), and the resulting errors are backpropagated to update the network parameters. This architecture is shown in Fig. 5.
The architecture of Multi-head ProbSparse Self-attention.
In particular, for each layer configuration, a pair of attention matrices is generated following the application of the aforementioned formula:
For the two feature mappings of the preceding and following time steps, \(X_{st}\) and \(X_{st+1}\), the query features originate from one of them, whereas the key and value features are derived from the other.
The outputs of the preceding moment’s features and those of the subsequent moment’s features are concatenated, along with the differences between them. The feature interaction process may be described as the following equation:
After three layers of encoder processing, the feature map output from the encoder is produced. The decoder is designed to generate long-sequence predictions in a single forward pass. The model employs a conventional decoder structure comprising two identical multi-head attention layers with generative prediction to address the high time complexity of predicting long sequences. The decoder input vector is represented as follows:

\(X_{de}^{st+1}=\mathrm{Concat}\left(X_{token}^{st+1},\,X_{0}^{st+1}\right)\)
where \(X_{token}^{st+1}\) is the start token: a sequence of length \(L_{token}\), taken from the seasonal input immediately preceding the sequence to be predicted. \(X_{0}^{st+1}\) represents the target sequence, which is set to 0.
Pseudo-code for feature interaction.
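A sketch of the interaction step in PyTorch; the fusion rule (concatenation of the two windows' attention features and their difference, followed by a linear projection) is an assumption consistent with the description above, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class FeatureInteraction(nn.Module):
    """Fuse attention features of adjacent windows and their difference.

    The concat-then-project fusion is assumed; the text only states that
    x_st, x_st+1 and their difference are aggregated by a fusion strategy.
    """
    def __init__(self, d_model):
        super().__init__()
        self.proj = nn.Linear(3 * d_model, d_model)

    def forward(self, h_st, h_st1):          # both: (batch, length, d_model)
        fused = torch.cat([h_st, h_st1, h_st1 - h_st], dim=-1)
        return self.proj(fused)

inter = FeatureInteraction(d_model=64)
h_st, h_st1 = torch.randn(2, 96, 64), torch.randn(2, 96, 64)
out = inter(h_st, h_st1)                    # fused features, (2, 96, 64)
```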
To prevent auto-regressive leakage, masked multi-head attention is utilised in the sparse self-attention computation, ensuring that each position cannot attend to subsequent positions. The final fully connected layer allows multivariate prediction. The decoder employs a generative structure, enabling the prediction sequence to be generated in a single operation; this eliminates the original dynamic decoding operation and significantly reduces decoding time. Ultimately, the trend and seasonal prediction results are aggregated and fed into a two-layer MLP to generate the final prediction, which can be expressed as follows:
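The final aggregation step can be sketched as a two-layer MLP over the concatenated trend and seasonal forecasts; the hidden width and activation are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Two-layer MLP fitting trend and seasonal forecasts into the output."""
    def __init__(self, pred_len, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * pred_len, hidden), nn.ReLU(),
            nn.Linear(hidden, pred_len),
        )

    def forward(self, trend_pred, seasonal_pred):   # each: (batch, pred_len)
        return self.net(torch.cat([trend_pred, seasonal_pred], dim=-1))

head = FusionHead(pred_len=48)
y = head(torch.randn(4, 48), torch.randn(4, 48))   # final PV power forecast
```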
To quantitatively compare and analyse the performance of the models, we utilise the following evaluation metrics: the NRMSE, the NMAE and \(R^2\), defined as:

\(\mathrm{NRMSE}=\frac{100\%}{p_{max}-p_{min}}\sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(p_t-\hat{p}_t\right)^{2}}\)

\(\mathrm{NMAE}=\frac{100\%}{p_{max}-p_{min}}\cdot\frac{1}{N}\sum_{t=1}^{N}\left|p_t-\hat{p}_t\right|\)

\(R^{2}=1-\frac{\sum_{t=1}^{N}\left(p_t-\hat{p}_t\right)^{2}}{\sum_{t=1}^{N}\left(p_t-\bar{p}\right)^{2}}\)
Here \(p_t\) is the true value, \(\hat{p}_t\) the predicted value, \(\bar{p}\) the mean of the true values, and \(p_{max}\) and \(p_{min}\) the maximum and minimum of the true values, respectively.
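With these definitions, the metrics can be computed directly; the implementation below follows the range-normalised forms stated above, on a small illustrative example.

```python
import numpy as np

def nrmse(p_true, p_pred):
    # RMSE normalised by the range of the true values, in percent.
    rmse = np.sqrt(np.mean((p_true - p_pred) ** 2))
    return 100.0 * rmse / (p_true.max() - p_true.min())

def nmae(p_true, p_pred):
    # MAE normalised by the range of the true values, in percent.
    return 100.0 * np.mean(np.abs(p_true - p_pred)) / (p_true.max() - p_true.min())

def r2(p_true, p_pred):
    # Coefficient of determination.
    ss_res = np.sum((p_true - p_pred) ** 2)
    ss_tot = np.sum((p_true - p_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

p_true = np.array([0.0, 2.0, 6.0, 10.0, 8.0, 3.0])   # toy PV power profile
p_pred = np.array([0.1, 2.2, 5.8, 9.7, 8.3, 2.9])
```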
This section validates the precision of the PV power prediction outcomes using actual operational historical data under a range of scenarios and temporal scales. The experiments were run on a computer with an i9-13900KF CPU, a GeForce RTX 4080 16 GB GPU, and the Windows 10 operating system. The experimental platform is Python 3.9 with PyTorch 1.10.0.
In this study, the NWP data for the Australian PV power plant were obtained from the Desert Knowledge Australia Solar Centre (DKASC) website, the meteorological information for the plant location from the NASA website, and the theoretical power outputs of the PV panels via the irradiance-to-power mapping model51. The following variables were considered: wind speed (m/s), temperature (°C), relative humidity (%), global horizontal radiation (W/m2), diffuse horizontal radiation (W/m2), wind direction (°), global radiation tilt (W/m2), PV theoretical value (kW), surface pressure (kPa), specific humidity (g/kg), ultraviolet (UV) index, sky clearness index, and so on. The data span December 1, 2021, to November 30, 2023, with a power sampling interval of 1 h. The data are separated into training, validation and test sets in chronological order in a 7:2:1 ratio.
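The chronological 7:2:1 split can be implemented without shuffling, for example:

```python
import numpy as np

def chrono_split(data, ratios=(0.7, 0.2, 0.1)):
    """Split a time-ordered array into train/val/test without shuffling."""
    n = len(data)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

hours = np.arange(24 * 730)          # two years of hourly samples
train, val, test = chrono_split(hours)
```

Keeping chronological order ensures the test set lies strictly after the training period, avoiding look-ahead leakage.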
After the raw data were cleaned using the multi-model parallel framework, including outlier removal and missing value handling, the Pearson correlation coefficient (R) was employed to investigate the relationship between the candidate variables and the power output of the PV panels, as illustrated in Fig. 6.
Thermal diagram of the correlation between PPG power and meteorological factors.
As shown in Fig. 6, the R values between PV power and these indices were obtained. There is an extremely strong correlation between PV power and global horizontal radiation, global radiation tilt, the theoretical PV value, the UV index, and the all-sky insolation clearness index. A strong correlation was also found between photovoltaic power and temperature, and a moderate correlation between photovoltaic power and both relative humidity and diffuse horizontal radiation. The other factors show weak correlations. In the following prediction model, the factors with moderate correlation and above are used as inputs, namely global horizontal radiation, global radiation tilt, the theoretical PV value, the UV index, the all-sky insolation clearness index, temperature, relative humidity and diffuse horizontal radiation.
To further analyse the sophistication of IFTformer, we also employed a range of alternative models, including ARIMA52, LSTM48, GRU53, XGBoost54, TCN55, Autoformer56, Informer57, DLinear58, and PatchTST59, for comparison and analysis. These models were used for comparison experiments with multiple prediction horizons set at [48, 96, 192, 336, 720]. The prediction results of different models are showcased in Table 2.
As demonstrated in Table 2, among all the prediction models, ARIMA, a traditional machine learning model, achieved the worst prediction performance. This outcome highlights the limitations of traditional machine learning models for medium- to long-term PV power prediction. Compared with the conventional deep learning models commonly employed in this field, the IFTformer achieves superior accuracy for medium- to long-term photovoltaic power prediction. Relative to the TCN, LSTM, GRU and XGBoost models, the NRMSE of the IFTformer decreased by 39.97%, 42.92%, 47.35% and 51.68%, respectively, on average; the NMAE decreased by 35.71%, 38.90%, 44.92% and 27.42%, respectively; and the R2 value increased by 1.55%, 3.25%, 4.56% and 10.86%, respectively. These traditional deep network models cannot model irregular, varying data well, whereas the IFTformer delivers better medium- and long-term accuracy.
Compared with the attention-based models Autoformer and Informer, as well as the popular linear model DLinear and the channel-based model PatchTST, the IFTformer prediction model demonstrates superior accuracy and generalizability. As the prediction horizon increases, the IFTformer prediction model exhibits a reduction in error growth and achieves the optimal overall performance. This performance is achieved by analysing the PV power trends from different perspectives and performing information interaction operations, thereby enabling the next time step to effectively learn from previous information.
The IFTformer outperforms the Transformer-based model, primarily because its accurate identification and modelling of raw data and trends exceeds those of the traditional Transformer model, thereby enhancing its prediction accuracy. Compared with hybrid structural models, the IFTformer prediction model achieves greater prediction accuracy from the root source through data cleaning and feature engineering of the original data. Additionally, the IFTformer forecasting model excels in forecasting trends and seasonality. The proposed information interaction operation effectively enhances its capacity to handle complex time series, thereby improving the accuracy and stability of the prediction. In summary, the superior ability of the IFTformer to identify anomalous data, handle trend changes and manage seasonal variations makes it significantly more effective than other models in predicting complex time series data.
To provide a more intuitive comparison of the predictive performance of the 10 models mentioned above, including the IFTformer and Informer, visualisation curves for the five prediction horizons of 48, 96, 192, 336 and 720 are presented in Fig. 7. Since the horizons of 192, 336, and 720 are difficult to discern owing to the multitude of time steps depicted, we have zoomed in on 24 time steps to facilitate viewing.
Comparison of the prediction results of the baseline models and the IFTformer across different prediction horizons.
Figure 7 shows the prediction results of the baseline models and the proposed IFTformer across different prediction horizons. The dark red line representing the IFTformer aligns closely with the original data, with minimal fluctuations across the various horizons. The flat sections correspond to night-time periods in which the PV plant produces essentially no power, allowing these time steps to be excluded without loss of information. Importantly, deep network models such as GRU and LSTM networks are susceptible to time series lag, which can lead to overdependence on values from the past few time points. Models based on the attention mechanism show slight improvements over deep networks; however, they still cannot achieve stable performance in long-term predictions, especially at the tail end of the time series. In contrast, the IFTformer generates prediction sequences that closely align with the actual values at the extreme points of the cycle. Although DLinear and PatchTST maintain stable performance under different horizons, when the peak fluctuations in PV power are pronounced their predictions fail to reflect the actual value and instead resemble the previous time step, rendering these models incapable of capturing sudden fluctuations. In conclusion, the proposed IFTformer exhibits considerable adaptability and reliability in PV power prediction across a range of prediction horizons.
In addition, 95% confidence interval prediction was conducted on the IFTformer results at different prediction horizons, as shown in Fig. 8. The width of the prediction interval during the noon period is significantly larger than at other times, indicating that photovoltaic uncertainty is higher at noon. At the same time, regardless of the horizon, the actual photovoltaic power output essentially falls within the prediction interval, indicating that the IFTformer achieves good coverage of the prediction points.
Comparison of the interval prediction results across different prediction horizons.
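The paper does not spell out how the 95% intervals in Fig. 8 are constructed; a common model-agnostic sketch builds them from the empirical quantiles of validation residuals. The function names and the residual-quantile method below are illustrative assumptions, not the authors' procedure:

```python
import numpy as np

def residual_interval(y_val, yhat_val, yhat_test, alpha=0.05):
    """(1 - alpha) prediction interval around point forecasts, built from
    the empirical quantiles of the validation residuals."""
    resid = y_val - yhat_val
    lo_q, hi_q = np.quantile(resid, [alpha / 2, 1 - alpha / 2])
    return yhat_test + lo_q, yhat_test + hi_q

def coverage(y_true, lower, upper):
    """Fraction of actual values falling inside the interval."""
    return float(np.mean((y_true >= lower) & (y_true <= upper)))
```

The coverage statistic computed this way corresponds to the observation above that the actual power basically falls within the interval.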
In the IFTformer model, after the data are preprocessed, the seasonal and residual trend components are modelled separately and finally fitted together to obtain the predicted results. To evaluate the contribution of each component, we conducted substitution experiments in which only one module is replaced at a time, while the remaining components are identical to those of the IFTformer. The IFT-AVG model replaces the data preprocessing module with a data cleaning method based on a smooth moving average. The IFT-MLP model replaces the residual trend prediction module with an MLP module. Finally, the IFE-TRANS model replaces the seasonal trend prediction module with a Transformer module. The prediction results of each model at all prediction horizons are presented in Table 3. The overall results are similar across different prediction horizons, and the IFTformer has smaller and more stable prediction errors than the other models of the same type.
Compared with IFT-AVG, the proposed IFTformer reduces the NRMSE by 22.22% and the NMAE by 29.47% on average. This finding demonstrates the efficacy of our proposed data cleaning methodology, which incorporates feature engineering operations, in addressing time series problems.
Compared with IFT-MLP, the IFTformer model with RevIN regularisation provides more accurate PV power prediction results, with average reductions of 11.43% and 24.69% in the NRMSE and NMAE, respectively. This finding illustrates the significance of this approach in handling smooth data and the substantial enhancement in the capacity to address the lag of linear models.
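The RevIN regularisation mentioned above normalises each input instance before the model and restores the saved statistics on the output to counter distribution shift. The following is a simplified NumPy sketch of that idea; the real RevIN (Kim et al., ICLR 2022) also carries learnable affine parameters, which are omitted here:

```python
import numpy as np

class RevIN:
    """Simplified reversible instance normalisation: normalise each series
    instance along the time axis, then invert the transform on the model
    output using the saved per-instance statistics."""
    def __init__(self, eps=1e-5):
        self.eps = eps

    def normalize(self, x):
        # x: (batch, time) or (batch, time, features)
        self.mean = x.mean(axis=1, keepdims=True)
        self.std = x.std(axis=1, keepdims=True) + self.eps
        return (x - self.mean) / self.std

    def denormalize(self, y):
        # restore the statistics saved during normalisation
        return y * self.std + self.mean
```

Applying `denormalize` to the model output is what makes the operation reversible and lets the network work on distribution-aligned inputs.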
Compared with IFE-TRANS, the model with a sparse attention mechanism and an information interaction module learns the intrinsic temporal relationships in the data more deeply than the traditional attention mechanism model does. This conclusion is evidenced by average reductions of 27.49% and 30.28% in the NRMSE and NMAE, respectively. Therefore, the IFTformer model has stronger generalizability and efficiency in modelling trend data.
To verify the contribution of each component in the IFTformer model and evaluate the performance of the improved prediction model, we designed the following 8 ablation experiments: (1) the IFTformer prediction model; (2) removing data preprocessing from the IFTformer and retaining only the prediction module; (3) removing the mechanism value of the photovoltaic power; (4) removing the RevIN regularisation operation; (5) replacing the attention module with the linear module; (6) replacing the linear module with the attention module; (7) removing the seasonality and trend decomposition of the time series and simply averaging the results of the attention mechanism module and the linear module; and (8) removing the information exchange module. The dataset utilised in the experiments was as previously described. The prediction horizon is selected from {48, 96, 192, 336, 720}. The ablation experimental results are shown in Table 4.
As illustrated in Table 4, the predictive performance of IFTformer is superior to that of the other prediction models lacking specific components. This evidence suggests that preprocessing, photovoltaic power mechanism values, regularisation, data decomposition, the linear module, the attention module, and the information interaction module collectively contribute to medium- to long-term PV power predictions. The results of experiment (2) demonstrate that removing data preprocessing from the IFTformer model results in a significant error, indicating that the impact of data preprocessing on subsequent predictions is considerable. Experiments (3) and (4) show that preparatory work, such as regularisation and building the mechanism prediction model, significantly increases the prediction accuracy. Experiments (5) and (6) demonstrate that IFTformer outperforms either single module in terms of predictive accuracy, confirming that the two distinct prediction modules developed in this study are effective. Furthermore, Experiment (7) demonstrates that decomposing the time series into trend and seasonal components enhances the network's feature extraction ability, thereby improving the prediction performance. Similarly, the results of Experiment (8) show a notable increase in error for IFTformer in the absence of the information interaction module, underscoring the significance of the information interaction operation for accurately capturing time information at a deep level.
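The seasonal-trend decomposition step can be sketched with the centred moving-average split used by Autoformer/DLinear-style models; the kernel size and edge padding below are illustrative assumptions rather than the paper's exact settings:

```python
import numpy as np

def series_decomp(x, kernel=25):
    """Split a 1-D series into seasonal and trend parts with a centred
    moving average (odd kernel). Edges are padded by repeating the first
    and last values so the trend has the same length as the input."""
    pad = kernel // 2
    padded = np.concatenate([np.repeat(x[0], pad), x, np.repeat(x[-1], pad)])
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    seasonal = x - trend
    return seasonal, trend
```

By construction the two parts always sum back to the original series, which is what makes modelling them separately and recombining them well-defined.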
In this study, we fully utilise the advantages of the DIF and LOF recognition algorithms to construct a DIF-LOF parallel framework for data cleaning. The experimental results are shown in Fig. 9.
Abnormal data cleaning effect via DIF-LOF.
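A minimal sketch of a DIF-LOF-style parallel screen is shown below, using scikit-learn's plain IsolationForest as a stand-in for the deep isolation forest and an intersection fusion rule (a sample is dropped only when both detectors flag it). Both the stand-in detector and the fusion rule are assumptions about the paper's implementation:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

def parallel_clean(X, contamination=0.05, random_state=0):
    """Run two outlier detectors independently (in parallel, not in
    sequence) and remove a sample only when both agree it is anomalous."""
    iso = IsolationForest(contamination=contamination,
                          random_state=random_state).fit_predict(X)
    lof = LocalOutlierFactor(n_neighbors=20,
                             contamination=contamination).fit_predict(X)
    keep = ~((iso == -1) & (lof == -1))   # -1 marks an outlier
    return X[keep], keep
```

The intersection rule is conservative: it keeps borderline points that only one detector dislikes, which matches the motivation for combining a global (forest-based) and a local (density-based) detector.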
The effects of different anomalous data processing methods on the final prediction results are explored. The following frameworks are considered: C1–C6 represent the parallel framework based on the DIF and LOF algorithms, the LOF algorithm followed by the DIF algorithm, the DIF algorithm followed by the LOF algorithm, the LOF algorithm alone, the DIF algorithm alone, and the use of raw data, respectively. Experiments were conducted on several different models, including the IFTformer model with a prediction horizon of 720, as well as the LSTM and Informer models. The results of the experiments are presented in Fig. 10.
Performance of different abnormal data cleaning methods via different models.
As illustrated in Fig. 10, the NRMSE and NMAE values of the IFTformer model trained on cleaned data are reduced to varying degrees, and the accuracy of the prediction model is improved compared with that of the model trained on the original dataset.
Compared with C6, which uses the raw data without cleaning, processing the dataset via the method described in this article leads to reductions in the NRMSE and NMAE of 0.5% and 1.3%, respectively, for PV power prediction with the IFTformer model. The improvement in prediction accuracy from the two sequential integration methods, C2 and C3, is slightly inferior to that of method C1 proposed in this paper. The reduction in the error values of the two single models, C4 and C5, is minimal, and their improvement in prediction accuracy is not as good as that of the integrated method proposed in this paper. In addition, after the same abnormal data processing method is applied, the NMAE and NRMSE values differ between the LSTM and Informer models, yet the predictive indicators of C1 remain superior to those of C2–C6. The experimental results show that the DIF-LOF parallel framework proposed in this paper can effectively clean abnormal photovoltaic power data, improve the prediction accuracy of the prediction model, and generalise well across models.
The effectiveness of embedding different factors that affect photovoltaic power into the deep prediction model varies. We selected several types of related factors that affect photovoltaic power prediction and embedded representative indicators to conduct experiments on the IFTformer model with a prediction horizon of 720. R1 embeds the theoretical PV value, which shows a very strong correlation; R2 embeds global horizontal radiation, which shows a strong correlation; R3 embeds temperature, which shows a strong correlation; R4 embeds relative humidity, which shows a medium correlation; R5 embeds wind speed, which shows a very weak correlation; and R6 embeds surface pressure, which shows a very weak correlation. The results of the experiment are shown in Table 5.
As illustrated in Table 5, the R1–R6 models, which incorporate a single influencing factor, exhibit poor prediction performance, with NRMSEs of 5.31–5.49% and NMAEs of 5.58–5.66%.
In general, the higher the correlation of the embedded influencing factors, the higher the prediction accuracy. Embedding the two highly correlated factors influencing PV power yields higher prediction accuracy than embedding either single factor alone. However, when fewer influencing factors are embedded than the full set of factors with moderate or higher correlations, the prediction model cannot learn enough information from them, resulting in poorer prediction performance. The predictive effect of embedding the four factors with moderate or higher correlations is better than that of embedding the weakly correlated factors, with the NRMSE and NMAE being reduced by 5.88% and 7.28%, respectively. Although embedding all of R1–R6 provides a substantial quantity of data for the model to learn, R5 and R6 are the most weakly correlated influences, i.e., they are equivalent to noise, consequently leading to a decrease in prediction accuracy.
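The NRMSE and NMAE used throughout can be computed as below. Normalising by the plant's rated capacity is an assumption in this sketch, since the normalisation base is not restated here:

```python
import numpy as np

def nrmse(y, yhat, capacity):
    """Normalised root mean square error, normalised by rated capacity."""
    return np.sqrt(np.mean((y - yhat) ** 2)) / capacity

def nmae(y, yhat, capacity):
    """Normalised mean absolute error, normalised by rated capacity."""
    return np.mean(np.abs(y - yhat)) / capacity
```

Because both metrics share the same normalisation base, they can be compared directly across sites with different installed capacities.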
The Yulara Photovoltaic Power Station data from January 1, 2023 to December 31, 2024 are taken as an example for analysis. The data are split into training, validation, and test sets at a ratio of 7:2:1 for IFTformer modelling and prediction. To test the predictive performance of the proposed method at a different geographical location, three indicators, NRMSE, NMAE, and R2, were selected for performance evaluation. The results are shown in Table 6 below.
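The chronological 7:2:1 split described above can be sketched as follows; shuffling is deliberately avoided so the test window stays strictly in the future of the training data:

```python
def chrono_split(data, ratios=(0.7, 0.2, 0.1)):
    """Chronological train/validation/test split (default 7:2:1).
    The data are assumed to be ordered by timestamp."""
    n = len(data)
    n_train = int(round(n * ratios[0]))
    n_val = int(round(n * ratios[1]))
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])
```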
Based on the data analysis in Table 6, the prediction accuracy of the IFTformer model further proves that the model is effective for medium- and long-term prediction modelling and has good universality. According to the table, the NRMSE and NMAE are similar to those for the Alice Springs site, with an average increase of 0.67% in R2, mainly because the secondary decomposition enhances the trend and periodic components of the photovoltaic power characteristics. The interactive attention operation better captures the temporal characteristics of photovoltaic power generation and performs stably on different datasets and time ranges.
In the field of photovoltaic (PV) power generation, the primary factors influencing performance are the randomness and variability of electricity production, especially in medium- and long-term forecasting. A complementary fusion approach combining GRU and XGBoost models was proposed to enhance long-term hourly solar forecasting performance60. Compared with the state-of-the-art DLinear model, that approach demonstrated reductions of 28.3% in mean squared error (MSE) and 17.4% in mean absolute error (MAE) when predicting the next 150 steps. In contrast, the IFTformer model proposed in this study focuses on anomaly detection and improving the accuracy of medium- and long-term forecasts. It places greater emphasis on feature extraction and time series analysis, potentially offering superior performance under complex weather conditions or fluctuating environmental factors.
However, several external factors must still be considered in practical applications, such as dust accumulation, extreme weather, and smog, all of which significantly impact the efficiency and power generation capabilities of PV systems. Dust accumulation, in particular, is a critical factor in the degradation of PV system performance. Dust deposition not only directly blocks sunlight from reaching the PV modules, reducing power generation efficiency, but can also alter the temperature characteristics of the modules, further affecting their output. Studies have shown that the physical and chemical properties of dust vary significantly across regions, making its impact on PV systems region-specific61. Dust accumulation can significantly impact PV output power, with potential reductions reaching up to 98.13%62. To address this challenge, intelligent cleaning technologies have been proposed to optimize cleaning strategies for solar panels. Similarly, significant performance degradation in PV systems due to dust deposition can be predicted using time series methods, as emphasized by Haneen Abuzaid et al.63. Moreover, extreme weather events present a new challenge to the stability of PV power generation systems. For example, heavy rainfall, blizzards, and high-temperature weather not only impact the cleanliness of the PV panels but can also cause physical damage to the modules and degrade their electrical performance64,65. One study introduced a VMD-KELM-based architecture for forecasting PV station output during adverse weather conditions66. The results showed significant reductions in PV power output during dust storms, hail, thunderstorms, and blizzards, with PV power forecasting accuracy ranked from highest to lowest in that order.
In addition, other environmental factors, such as smog, tilt angle, installation orientation, and ground reflectivity also influence the performance of PV systems. These factors collectively affect the light absorption and conversion efficiency of PV modules67. Incorporating more environmental and climatic variables into the feature set of forecasting models provides a more comprehensive data foundation for long-term PV performance prediction. For instance, combining meteorological data, environmental monitoring data, and historical power generation data can improve the generalization capabilities of forecasting models68,69.
This study presents an interactive deep learning framework named IFTformer for medium- to long-term PV power generation prediction. The framework incorporates a data preprocessing module designed to accurately remove various types of anomalous data; a module for deep fitting of the trend component of the data, addressing the issue of time series lag; and a module for exploring temporal discrepancies between time steps of different lengths. The model comprehensively integrates data-driven modelling with domain knowledge, with the objective of ensuring that the prediction results are mechanistically sound and therefore more accurate.
The IFTformer model employs a neural network that is not only driven by the data but also obtains information from the time steps preceding and following the current step, which has the potential to significantly increase prediction accuracy. For medium- to long-term PV power generation prediction with five different prediction horizons from actual PV power plants, the IFTformer model outperforms mainstream models such as ARIMA, LSTM, GRU, XGBoost, TCN, Autoformer, Informer, DLinear, and PatchTST, achieving higher prediction accuracy: the average NRMSE is 3.64%, and the average NMAE is 2.44%. The experimental results reveal that the IFTformer model exhibits superior performance in medium- to long-term PV power prediction. Future research on medium- and long-term photovoltaic power prediction could consider extreme weather, dust coverage of photovoltaic panels, and other factors to predict medium- and long-term photovoltaic power generation more accurately.
The data that support the findings of this study are available from the author Xiang Liu (2022240147@stu.syau.edu.cn) upon reasonable request.
Hassan, Q. et al. A comprehensive review of international renewable energy growth. Energy Built Environ. https://doi.org/10.1016/j.enbenv.2023.12.002 (2024).
International Renewable Energy Agency (IRENA). Report, (2024).
Khalid, H. M. et al. Dust accumulation and aggregation on PV panels: An integrated survey on impacts, mathematical models, cleaning mechanisms, and possible sustainable solution. Solar Energy 251, 261–285 (2023).
Aljdaeh, E. et al. Performance enhancement of self-cleaning hydrophobic nanocoated photovoltaic panels in a dusty environment. Energies 14, 6800 (2021).
Rehman, A. U., Ullah, Z., Qazi, H. S., Hasanien, H. M. & Khalid, H. M. Reinforcement learning-driven proximal policy optimization-based voltage control for PV and WT integrated power system. Renew. Energy 227, 120590 (2024).
Guermoui, M., Bouchouicha, K., Bailek, N. & Boland, J. W. Forecasting intra-hour variance of photovoltaic power using a new integrated model. Energy Convers. Manag. 245, 114569 (2021).
Li, N. et al. Research on short-term photovoltaic power prediction based on multi-scale similar days and ESN-KELM dual core prediction model. Energy 277, 127557 (2023).
Sabadus, A. et al. A cross-sectional survey of deterministic PV power forecasting: Progress and limitations in current approaches. Renew. Energy 226, 120385 (2024).
Jung, Y., Jung, J., Kim, B. & Han, S. Long short-term memory recurrent neural network for modeling temporal patterns in long-term power forecasting for solar PV facilities: Case study of South Korea. J. Clean. Prod. 250, 119476 (2020).
Guijo-Rubio, D. et al. Evolutionary artificial neural networks for accurate solar radiation prediction. Energy 210, 118374 (2020).
Mayer, M. J. Benefits of physical and machine learning hybridization for photovoltaic power forecasting. Renew. Sustain. Energy Rev. 168, 112772 (2022).
Lin, G.-Q. et al. An improved moth-flame optimization algorithm for support vector machine prediction of photovoltaic power generation. J. Clean. Prod. 253, 119966 (2020).
Skoplaki, E. & Palyvos, J. A. On the temperature dependence of photovoltaic module electrical performance: A review of efficiency/power correlations. Solar Energy 83, 614–624 (2009).
Xu, H., Pang, G., Wang, Y. & Wang, Y. Deep isolation forest for anomaly detection. IEEE Trans. Knowl. Data Eng. 35, 12591–12604 (2023).
Dogmechi, S., Torabi, Z. & Daneshpour, N. An outlier detection method based on the hidden Markov model and copula for wireless sensor networks. Wirel. Netw. 30, 4797–4810 (2024).
Khazaei, S., Ehsan, M., Soleymani, S. & Mohammadnezhad-Shourkaei, H. A high-accuracy hybrid method for short-term wind power forecasting. Energy 238, 122020 (2022).
Breunig, M. M., Kriegel, H.-P., Ng, R. T. & Sander, J. LOF: Identifying density-based local outliers. In Proc. 2000 ACM SIGMOD International Conference on Management of Data 93–104 (2000).
Li, G. et al. Outlier data mining method considering the output distribution characteristics for photovoltaic arrays and its application. Energy Rep. 6, 2345–2357 (2020).
Li, H. et al. Algorithm of vehicle’s data cleaning and monitoring. J. Phys. Conf. Ser. 1828, 012052 (2021).
Hu, A. et al. A novel photovoltaic array outlier cleaning algorithm based on sliding standard deviation mutation. Energies 12, 4316 (2019).
Mayer, M. J. & Gróf, G. Extensive comparison of physical models for photovoltaic power forecasting. Appl. Energy 283, 116239 (2021).
Hove, T. A method for predicting long-term average performance of photovoltaic systems. Renew. Energy 21, 207–229 (2000).
Zhang, L., Wang, J., Qian, Y. & Li, Z. Photovoltaic power prediction system based on multi-stage data processing strategy and improved optimizer. Appl. Math. Modell. 132, 226–251 (2024).
Louzazni, M., Mosalam, H., Khouya, A. & Amechnoue, K. A non-linear auto-regressive exogenous method to forecast the photovoltaic power output. Sustain. Energy Technol. Assess. 38, 100670 (2020).
Ding, S. A novel adaptive discrete grey model with time-varying parameters for long-term photovoltaic power generation forecasting. Energy Convers. Manag. 227, 113644 (2021).
Wang, H. et al. Solar irradiance forecasting based on direct explainable neural network. Energy Convers. Manag. 226, 113487 (2020).
Zeng, J. & Qiao, W. Short-term solar power prediction using a support vector machine. Renew. Energy 52, 118–127 (2013).
Ni, Q., Zhuang, S., Sheng, H., Kang, G. & Xiao, J. An ensemble prediction intervals approach for short-term PV power forecasting. Solar Energy 155, 1072–1083 (2017).
Anagnostos, D. et al. A method for detailed, short-term energy yield forecasting of photovoltaic installations. Renew. Energy 130, 122–129 (2019).
Mazzeo, D. et al. Artificial intelligence application for the performance prediction of a clean energy community. Energy 232, 120999 (2021).
Lee, D. & Kim, K. Recurrent neural network-based hourly prediction of photovoltaic power output using meteorological information. Energies 12, 215 (2019).
Wang, L. et al. Accurate solar PV power prediction interval method based on frequency-domain decomposition and LSTM model. Energy 262, 125592 (2023).
Long, H., Zhang, C., Geng, R., Wu, Z. & Gu, W. A combination interval prediction model based on biased convex cost function and auto-encoder in solar power prediction. IEEE Trans. Sustain. Energy 12, 1561–1570 (2021).
Kong, W., Jia, Y., Dong, Z. Y., Meng, K. & Chai, S. Hybrid approaches based on deep whole-sky-image learning to photovoltaic generation forecasting. Appl. Energy 280, 115875. https://doi.org/10.1016/j.apenergy.2020.115875 (2020).
Wang, X. & Ma, W. A hybrid deep learning model with an optimal strategy based on improved VMD and transformer for short-term photovoltaic power forecasting. Energy 295, 131071 (2024).
Li, G. et al. Research on a novel photovoltaic power forecasting model based on parallel long and short-term time series network. Energy 293, 130621 (2024).
Moon, J. A multi-step-ahead photovoltaic power forecasting approach using one-dimensional convolutional neural networks and transformer. Electronics 13, 2007 (2024).
Wu, Z. et al. Prediction of photovoltaic power by the informer model based on convolutional neural network. Sustainability 14, 13022 (2022).
Jiang, Y. et al. Ultra-short-term PV power prediction based on Informer with multi-head probability sparse self-attentiveness mechanism. Front. Energy Res. 11, 1301828 (2023).
Xue, Y., Guan, S. & Jia, W. PMformer: A novel informer-based model for accurate long-term time series prediction. Inf. Sci. 690, 121586 (2025).
Xu, W., Li, D., Dai, W. & Wu, Q. Informer short-term pv power prediction based on sparrow search algorithm optimised variational mode decomposition. Energies 17, 2984 (2024).
Zhao, H. et al. CPTCFS: CausalPatchTST incorporated causal feature selection model for short-term wind power forecasting of newly built wind farms. Int. J. Electr. Power Energy Syst. 160, 110059 (2024).
Gao, Y. et al. Ultra-short-term wind power prediction based on the ZS-DT-PatchTST combined model. Energies 17, 4332 (2024).
José, M. & Patrícia, R. Evaluating the effectiveness of time series transformers for demand forecasting in retail. Mathematics 12, 2728 (2024).
Ding, S., Li, R. & Tao, Z. A novel adaptive discrete grey model with time-varying parameters for long-term photovoltaic power generation forecasting. Energy Convers. Manag. 227, 113644 (2021).
Rivero-Cacho, A., Sanchez-Barroso, G., Gonzalez-Dominguez, J. & Garcia-Sanz-Calcedo, J. Long-term power forecasting of photovoltaic plants using artificial neural networks. Energy Rep. 12, 2855–2864 (2024).
Yuan, L., Wang, X., Sun, Y., Liu, X. & Dong, Z. Y. Multistep photovoltaic power forecasting based on multi-timescale fluctuation aggregation attention mechanism and contrastive learning. Electr. Power Energy Syst. 164, 110389 (2025).
Gao, M., Li, J., Hong, F. & Long, D. Day-ahead power forecasting in a large-scale photovoltaic plant based on weather classification using LSTM. Energy 187, 115838 (2019).
Cruz, R. M. O., Sabourin, R. & Cavalcanti, G. D. C. Dynamic classifier selection: Recent advances and perspectives. Inf. Fus. 41, 195–216 (2018).
Zhou, Y., Zhou, N., Gong, L. & Jiang, M. Prediction of photovoltaic power output based on similar day analysis, genetic algorithm and extreme learning machine. Energy 204, 117894 (2020).
Lorenz, E. & Heinemann, D. Prediction of solar irradiance and photovoltaic power. In Comprehensive Renewable Energy 239–292 (Elsevier, 2012). https://doi.org/10.1016/B978-0-08-087872-0.00114-1.
Zhang, J., Tan, Z. & Wei, Y. An adaptive hybrid model for day-ahead photovoltaic output power prediction. J. Clean. Prod. 244, 118858 (2020).
Kisvari, A., Lin, Z. & Liu, X. Wind power forecasting – A data-driven method along with gated recurrent neural network. Renew. Energy 163, 1895–1909 (2021).
Dai, Y., Zhou, Q., Leng, M., Yang, X. & Wang, Y. Improving the Bi-LSTM model with XGBoost and attention mechanism: A combined approach for short-term power load prediction. Appl. Soft Comput. 130, 109632 (2022).
Limouni, T., Yaagoubi, R., Bouziane, K., Guissi, K. & Baali, E. H. Accurate one step and multistep forecasting of very short-term PV power using LSTM-TCN model. Renew. Energy 205, 1010–1024 (2023).
Wu, H., Xu, J., Wang, J. & Long, M. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. In Advances in Neural Information Processing Systems 34 (2021).
Zhou, H. et al. Informer: Beyond efficient transformer for long sequence time-series forecasting. AAAI 35, 11106–11115 (2021).
Zeng, A., Chen, M., Zhang, L. & Xu, Q. Are transformers effective for time series forecasting?. AAAI 37, 11121–11128 (2023).
Peng, Y., Wang, Z., Castillo, I., LaGrande, G. & Jiang, S. A new modeling framework for real-time extreme electricity price forecasting. IFAC-PapersOnLine 58(14), 899–904 (2024).
Xu, Y. et al. A complementary fused method using GRU and XGBoost models for long-term solar energy hourly forecasting. Expert Syst. Appl. 254, 124286 (2024).
Yao, W. et al. Analysis of dust deposition law at the micro level and its impact on the annual performance of photovoltaic modules. Energy 306, 132448 (2024).
Wan, L., Zhao, L., Xu, W., Guo, F. & Jiang, X. Dust deposition on the photovoltaic panel: A comprehensive survey on mechanisms, effects, mathematical modeling, cleaning methods, and monitoring systems. Solar Energy 268, 112300 (2024).
Abuzaid, H., Awad, M., Shamayleh, A. & Alshraideh, H. Predictive modeling of photovoltaic system cleaning schedules using machine learning techniques. Renew. Energy 239, 122149 (2024).
Micheli, L., Almonacid, F., Bessa, J. G., Fernández-Solas, Á. & Fernández, E. F. The impact of extreme dust storms on the national photovoltaic energy supply. Sustain. Energy Technol. Assess. 62, 103607 (2024).
Bamisile, O., Acen, C., Cai, D., Huang, Q. & Staffell, I. The environmental factors affecting solar photovoltaic output. Renew. Sustain. Energy Rev. 208, 115073 (2025).
Zhao, Y., Wang, B., Wang, S., Xu, W. & Ma, G. Photovoltaic power generation power prediction under major extreme weather based on VMD-KELM. Energy Eng. 121(12), 3711–3733 (2024).
Ye, Y. et al. Evaluating the geographical, technical and economic potential of wind and solar power in China: A critical review at different scales. Sustain. Energy Technol. Assess. 72, 104037 (2024).
Al-Dahidi, S., Hammad, B., Alrbai, M. & Al-Abed, M. A novel dynamic/adaptive K-nearest neighbor model for the prediction of solar photovoltaic systems’ performance. Results Eng. 22, 102141 (2024).
Carpentieri, A., Folini, D., Leinonen, J. & Meyer, A. Extending intraday solar forecast horizons with deep generative models. Appl. Energy 377, 124186 (2025).
The Key Project of the National Natural Science Foundation of China (U23B20118).
College of Information and Electrical Engineering, Shenyang Agricultural University, Shenyang, 110866, China
Xiang Liu, Qingyu Liu, Shuai Feng, Haoran Chen & Chunling Chen
Electric Power Research Institute, State Grid Liaoning Electric Power Co., Ltd, Shenyang, 110055, China
Yangyang Ge
X.L. and Q.L. carried out the experiment and wrote the main manuscript with support from S.F. and C.C. Y.G. provided the data collection. H.C. carried out the experiment result analysis. S.F. and C.C. supervised the project. All authors reviewed the manuscript.
Correspondence to Shuai Feng or Chunling Chen.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Liu, X., Liu, Q., Feng, S. et al. Novel model for medium to long term photovoltaic power prediction using interactive feature trend transformer. Sci Rep 15, 6544 (2025). https://doi.org/10.1038/s41598-025-90654-4
Scientific Reports (Sci Rep)
ISSN 2045-2322 (online)
© 2025 Springer Nature Limited