A multi-step electricity prediction model for residential buildings based on ensemble Empirical Mode Decomposition technique

Residential electricity demand is increasing rapidly, constituting about a quarter of total energy consumption. Electricity demand prediction is one of the sustainable solutions to improve energy efficiency in real-world scenarios. The non-linear and non-stationary consumption patterns in residential buildings make electricity prediction more challenging. This paper proposes a multi-step prediction approach that first conducts cluster analysis to identify seasonal consumption patterns. Secondly, an improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) method and autoencoder model has been deployed to remove irregular patterns, noise, and redundancy from electricity load time series. Finally, the Long Short-Term Memory (LSTM) model has been trained to predict electricity consumption by considering historical, seasonal, and temporal data dependencies. Further, experimental analysis has been conducted on real-time electricity consumption datasets of residential buildings. The comparative results reveal that the proposed multi-step model outperformed the existing state-of-the-art RF-LSTM-based prediction model and attained higher accuracy.


Introduction
Buildings and industry are currently the largest power consumers, accounting for more than 90% of global electricity consumption, as per the report of the International Energy Agency (IEA), 2022 [1]. Moreover, the energy consumption industry in India is anticipated to increase from 20% to roughly 50% by 2040. According to the report of IEA, 2022 [1], the industrial sector consumes the maximum electricity (44%), followed by the residential sector (24%), which is a quarter of the total energy consumption, as seen in Figure 1. The alarmingly rising energy consumption is a serious concern for energy suppliers and utility companies. Thus, energy efficiency techniques must be developed to reduce electricity consumption. Predicting energy consumption plays a vital role in increasing energy efficiency. It is a foundation for many energy management, monitoring, and optimization methods that provide usage patterns of residential buildings [2]. In the past decade, many researchers have proposed electricity demand forecasting techniques [3] based on machine learning algorithms such as Support Vector Regressor (SVR) [4], Random Forest (RF) [5], Decision Trees (DT) [6], Artificial Neural Network (ANN) [7], Convolutional Neural Networks (CNN) [8], and Multi-Layer Perceptron (MLP) [9]. However, the traditional machine learning approaches suffer from substantial deficits, such as non-adaptability, inability to handle long-term dependencies, and inaccurate predictions [10]. Among these algorithms, neural networks show promising results in predictive analysis, anomaly detection, and pattern recognition [11,12]. However, these models may have certain problems, such as over-fitting, hyperparameter selection, and significant training time. To overcome these deficiencies, a few authors [13-15] have proposed hybrid approaches that combine data decomposition techniques with prediction models. Still, there is a need to develop an improved approach to investigate and predict electricity consumption in real-world scenarios.
The present research proposes a multi-step prediction approach that integrates season-wise cluster analysis, the Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (I-CEEMDAN) method, an autoencoder, and a Long Short-Term Memory (LSTM) model. The paper is organized as follows: Section 2 discusses the background of the energy prediction models. Further, Section 3 explains the methodology of the proposed approach, and Section 4 discusses the experimental results obtained by the proposed prediction model. Finally, Section 5 provides the conclusion of the proposed work.

Background
Several researchers have done significant research to predict energy consumption in residential buildings by deploying various machine learning and deep learning techniques [16]. Some authors have used data clustering techniques to analyze energy consumption patterns and trends. The following subsections summarize the latest energy prediction research using data clustering, machine learning, and hybrid techniques.

Energy prediction: cluster-based approaches
A few authors have employed data clustering algorithms to get meaningful insights out of the energy consumption scenarios. Kaur and Bala [17] have proposed an energy prediction technique based on K-means clustering to fetch energy usage patterns of home appliances in residential buildings. An RF model has been trained for predictive analysis using the climate conditions along with energy consumption data of home appliances. Chinthavali et al. [18] have identified similar weather day/week pairs to compare the energy cost with and without applying optimization. Verma et al. [19] have proposed an energy consumption optimization approach for various home appliances grouped into clusters based on similar usage behavior. Further, Luo et al. [20] have performed feature extraction on weather data using K-means clustering and created weather clusters. Later, the authors predicted the week-ahead hourly energy consumption by employing GA-DNN. Season-wise cluster formation based on a hierarchical clustering algorithm has been proposed by Bedi et al. [21]. The extracted clusters have been used for energy prediction of different seasons of the year. Therefore, the extracted cluster data can be used to develop an energy prediction model using machine learning approaches. The subsequent section explores various machine learning techniques to predict energy consumption.

Energy prediction: machine learning-based approaches
Several machine learning algorithms have been widely adopted to predict the energy consumption of residential as well as other buildings. Jain et al. [10] have proposed an energy forecasting model based on the SVM algorithm for multi-family buildings and concluded that the spatial and temporal features improved the predictive performance. They also suggested the necessity of installing smart meters to get high-resolution energy consumption data. Wahid et al. [22] have developed an energy consumption prediction model for residential buildings using MLP and RF for appliance classification. They analyzed the on-off times of home appliances based on electrical usage data. Huber et al. [23] also analyzed the on-off times of home appliances and predicted the energy using histograms, pattern search, and Bayesian algorithms. Tiwari et al. [2] have deployed logistic regression, decision tree, Support Vector Machine (SVM), naive Bayes, RF, and k-nearest neighbor algorithms for energy prediction of smart grids and determined that SVM outperformed the others in accuracy.
Besides traditional machine learning algorithms, the authors [24] have proposed a neural network-based energy prediction model. They have optimized the neural networks using the shark smell optimization algorithm. The hybrid prediction model has been used to estimate the energy load of small-scale buildings. Fan et al. [25] have proposed deep learning-based models to construct the features automatically and applied fully connected and convolutional autoencoders to improve energy predictions. Bourhnane et al. [7] have implemented an energy prediction and scheduling approach for smart buildings using ANN and genetic algorithms. A big data analytics-based energy prediction model has been proposed by Kumari et al. [26]. The LSTM model and the genetic algorithm have been applied to estimate the energy consumption of residential buildings. Furthermore, for individual household appliances, Kaur et al. [27] have proposed an intelligent energy prediction and optimization approach based on an LSTM model and genetic algorithm.

Energy prediction: hybrid approaches
Some authors have integrated the data decomposition techniques with the prediction model to achieve optimal performance. The effectiveness of the decomposition techniques can be seen in their results [14,28]. For instance, An et al. [29] combined Empirical Mode Decomposition (EMD) with a prediction model. Even though EMD improved the prediction performance, the reconstructed signal or the aggregated predictions include residual noise. To resolve the issue, Wu and Huang [30] have proposed an Ensemble Empirical Mode Decomposition (EEMD) method in which white noise was added to eliminate the mode mixing problem. However, EEMD suffers from high computational time. Colominas et al. [31] proposed CEEMDAN with improved decomposition ability and reduced computational time. It adds an adaptive noise at each level of decomposition. Chai et al. [15] have proposed a hybrid feature-driven ensemble forecasting model based on extreme learning machine and particle swarm optimization. The time-series data has been decomposed and reconstructed by Variational Mode Decomposition (VMD) and the sample entropy algorithm. Karijadi et al. [14] utilized CEEMDAN to decompose the non-stationary time series signals. Next, RF and LSTM models have been deployed to predict each extracted Intrinsic Mode Function (IMF).
The above literature review has emphasized the significance of improving energy prediction performance. However, the energy prediction of real-time residential buildings becomes critical due to non-linear, fluctuating energy consumption patterns. Moreover, seasonal variations, usage patterns, and the number of occupants mainly influence energy consumption in buildings. The research challenges and novel contributions of the proposed work are described as follows:

Research challenges and our contributions
Several authors have implemented clustering to observe the energy consumption scenarios in the residential sector [17,21,32]. Still, the impact of seasonal variations of energy consumption needs to be explored using a real-time environment. In this paper, real-time data clustering has been done to analyze the effect of climatic conditions on electricity consumption patterns in residential buildings. The non-linear and fluctuating nature of the time series dataset makes the energy prediction task challenging [14]. The present work handles the load fluctuations and non-linearity by decomposing the original real-time dataset into a set of Intrinsic Mode Functions (IMFs) using an improved CEEMDAN method. For each extracted mode component $IMF_1, IMF_2, \ldots, IMF_n$, the autoencoder model is deployed to reconstruct the decomposed signal. Further, the reconstructed data provided by the autoencoder model is used by the LSTM model to learn the non-linear features and underlying patterns accurately to improve its prediction performance [13]. The proposed work integrates the LSTM model with an improved CEEMDAN method. The sliding window approach has been used to generate an input window and feed it into the LSTM model to address long-term data dependencies. Most authors have adopted noise-free static, benchmark, or public energy consumption datasets to evaluate prediction models [11,33]. The proposed work exploits the real-time electricity consumption dataset to evaluate the hybrid prediction model.

Proposed methodology
The proposed work aims to predict the energy demand of residential buildings using real-time electricity consumption data. Electricity demand prediction is driven by the correctness and reliability of historical data. Real-time electricity data collection is affected by malfunctioning smart meters, changing weather conditions, communication issues, etc. These factors may create undesired noise and uncertainties in the electricity consumption dataset. The present work considers the seasonality of data and identifies similar energy consumption patterns using Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) clustering, which preprocesses and summarizes the whole dataset. Further, the proposed work deploys an I-CEEMDAN-based noise removal approach to decompose the electricity consumption signals into sub-signals. For each extracted sub-signal, the autoencoder model performs the reconstruction of the decomposed input signal. Ultimately, the LSTM neural network model is applied to estimate the electricity demand. The methodology to implement the proposed research is illustrated in Figure 2. Each module of the proposed work is detailed in the following subsections.

Real-time dataset description and preprocessing
The proposed work has exploited real-time data for estimating future electricity demand. Real-time data is more actionable and reliable and exhibits unpredictable events, weather changes, and changing user behavior [16]. The prediction models developed for real-time data are more adaptive to a wide range of scenarios. The real-world residential buildings' electricity consumption dataset has been taken from Punjab State Power Corporation Limited (PSPCL), Punjab, India [34]. The dataset recorded the actual energy consumed (kWh) by consumers of residential buildings for a period of 1 year. The dataset consists of multi-family and single-family residential buildings. The selected buildings exhibit heterogeneous consumption patterns and trends, as seen in Figure 3. The data exploration shown in Figure 3 reveals that each building exhibits non-linear and non-stationary energy consumption scenarios.
Data preprocessing is crucial before developing a deep learning model because it can significantly affect prediction accuracy. The electricity consumption dataset may contain missing values. The following data preprocessing steps are applied to the electricity dataset. The set of missing values is interpolated using the mean of the previous year's data values for the same time interval. Linear interpolation is adopted to estimate the missing values in the time-series data; it calculates the unknown values in the same increasing or decreasing order as the previous values.
The electricity consumption measurements need to be normalized on the same scale. The Min-Max scaler is applied for feature scaling of electricity load data. The scaler converts the electricity measurements column into the range [0, 1] [35]. For the electricity load feature $P_t$, the new normalized feature $P_n$ is given by equation (1):
$$P_n = \frac{P_t - P_{min}}{P_{max} - P_{min}} \quad (1)$$
where $P_{min}$ and $P_{max}$ are the minimum and maximum values of the electricity load feature.
The geographical and semitropical location of the state is the reason behind substantial temperature variation between different months. In the following section, seasonal clusters have been extracted for real-time residential buildings.
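The two preprocessing steps described above, linear interpolation of missing readings and Min-Max scaling, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of `None` to mark a missing reading, and the handling of gaps at the start or end of the series (copying the nearest known value) are all assumptions made for the example.

```python
def linear_interpolate(values):
    """Fill None gaps on a straight line between the nearest known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    # Assumption: leading/trailing gaps copy the nearest known reading.
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    # Interior gaps: interpolate linearly between consecutive known readings.
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def min_max_scale(values):
    """Scale values into [0, 1] using (P_t - P_min) / (P_max - P_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical daily readings in kWh with two missing days.
daily_kwh = [2.0, None, None, 8.0]
clean_kwh = linear_interpolate(daily_kwh)
scaled_kwh = min_max_scale(clean_kwh)
```

In practice the minimum and maximum would be taken from the training split only, so that the test data does not leak into the scaling.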

Data clustering
In order to perform accurate and efficient forecasting of electricity consumption patterns, cluster analysis would be helpful to provide a deep understanding of usage by examining the seasonal variation. For the given real-world scenarios, the prevailing climatic conditions are characterized by intense heat and extremely cold temperatures. As a result, the degree of variation and fluctuation can be observed in the electricity consumption of households. The present work employs a hierarchical clustering technique called BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) [36] to fetch seasonal electricity consumption patterns. BIRCH can handle a large amount of historical energy consumption data and is suitable when data points are not uniformly distributed [37]. In order to identify the overall trends and patterns of data, it attempts to determine the dense and sparse regions. Using BIRCH, the individual data points are not evaluated, but a dense region of data points is treated as a single cluster. BIRCH involves grouping the data points into compact summaries called Clustering Features (CF), which are further grouped into even more compact clusters known as CF trees [38]. CF is a vector of three values $(N, LS, SS)$, where $N$ denotes the time series length, $LS$ is the linear sum, and $SS$ is the squared sum of the data points.
Algorithm 1 is used to create the seasonal clusters using a real-time electricity consumption dataset. Firstly, the threshold value is initialized, and input data is scanned. The algorithm begins with a default threshold value, scans the data, and inserts data points into the tree. The initial CF tree is constructed, but if it runs out of memory, the CF tree is rebuilt by increasing the threshold value. If the number of data points is within a certain range, dense sub-clusters are grouped into larger ones, resulting in a smaller CF tree. Two disjoint adjacent time series are merged by adding their clustering features $CF_1$ and $CF_2$. The adjacent cluster merging is repeated until it reaches the end of the time-series data. Eventually, when the clustering ends, the sub-sequence $(P_1, P_2, P_3, \ldots, P_{N/c})$ is merged into the cluster vector $(C_1, C_2, \ldots, C_k)$, where $k$ is the number of clusters and $C_k$ is the newly created cluster. Hence, the output of the clustering algorithm is further used for seasonal analysis and trend analysis.
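The merging step relies on the additivity of clustering features: the CF of two merged segments is the component-wise sum of their CFs. The sketch below illustrates this property for one-dimensional load values. The class name, its methods, and the sample values are hypothetical and chosen only for illustration; the sketch does not build a CF tree and is not the authors' Algorithm 1.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusteringFeature:
    n: int      # N: number of data points summarised
    ls: float   # LS: linear sum of the points
    ss: float   # SS: squared sum of the points

    @classmethod
    def from_points(cls, points):
        return cls(len(points), sum(points), sum(p * p for p in points))

    def merge(self, other):
        # CF additivity: merging two segments adds their summaries.
        return ClusteringFeature(self.n + other.n,
                                 self.ls + other.ls,
                                 self.ss + other.ss)

    @property
    def centroid(self):
        return self.ls / self.n

    @property
    def radius(self):
        # Root-mean-square distance of the points from the centroid.
        variance = self.ss / self.n - self.centroid ** 2
        return math.sqrt(max(variance, 0.0))


# Two adjacent, disjoint segments of a (hypothetical) load series.
cf_1 = ClusteringFeature.from_points([1.0, 2.0])
cf_2 = ClusteringFeature.from_points([3.0, 4.0])
merged = cf_1.merge(cf_2)
```

Because only `(N, LS, SS)` is stored, the centroid and radius of a merged cluster are available without revisiting the raw readings, which is what lets BIRCH summarise a large dataset in a single scan.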

Decomposition and reconstruction
The real-time electricity consumption data exhibits complex seasonal variations, and therefore, it is necessary to extract essential features prior to modeling. The present work applies data decomposition on the electricity time-series data using I-CEEMDAN. The objective is to effectively capture the temporal dynamics and seasonality inherent in the data [31]. I-CEEMDAN addresses EEMD's mode mixing issue and reduces computation time to a certain level by including white noise adaptively during decomposition. The original non-stationary time series energy dataset is divided by I-CEEMDAN into stationary components, known as IMFs, to enhance prediction accuracy. I-CEEMDAN divides the original time series signals using the estimated local means of time series plus noise signals. Further, it finds the difference between the average local means and current residues to minimize the residual noise. The time-series decomposition problem is described using the following steps:
1. The electricity time series is represented as $P(t)$, where $t$ is the time-stamp.
2. Create the different ensemble series and, for each ensemble series $P(t)$, add white noise $w_i$ using equation (2).
3. For each $P_i(t)$, apply Empirical Mode Decomposition into IMFs using equation (3) and obtain the residual noise using equation (4).
Step 3 is repeated for $k = 2, 3, \ldots, K$ until the residual $r(t)$ contains non-stationary series. The final residual $R(t)$ can be obtained by equation (5). To ensure the completeness of the decomposition method, the original data is reconstructed using equation (6).
The decomposition performed by the I-CEEMDAN method obtained the mode components $IMF_1, IMF_2, \ldots, IMF_n$. The first mode component $IMF_1$ contains error, irregularity, and redundancy that must be addressed for accurate predictive performance. Therefore, this paper adopted the autoencoder model to train and reconstruct the original sub-signal. Autoencoders are a kind of neural network developed to map the input signal data into a latent space representation and reconstruct the original input signal from the encoded data [39]. The architecture of the autoencoder is split into two parts, as shown in Figure 2, namely the encoder and decoder network:
Encoder: It is trained to encode the IMFs into a lower-dimensional latent space representation.
Decoder: It is trained to reconstruct the original input from the encoded representation.
During the encoding-decoding process, the redundant features, noise, and irregular patterns are removed.
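The encoder and decoder mappings can be illustrated with a single-layer forward pass. The sketch below assumes the common form: the encoder computes `h = tanh(w_e · x + b_1)` and the decoder computes `x_hat = w_d · h + b_2`. The tanh activation, the linear decoder, the layer sizes, and the hand-picked weights are assumptions for illustration only; in the proposed approach the weights would be learned by minimising the reconstruction error on each IMF.

```python
import math


def dense(weights, bias, vector, activation=None):
    """One fully connected layer: activation(W . x + b)."""
    outputs = []
    for row, b in zip(weights, bias):
        z = sum(w * v for w, v in zip(row, vector)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def encode(x, w_e, b_1):
    # Encoder: compress the input window into a smaller latent vector.
    return dense(w_e, b_1, x, activation=math.tanh)


def decode(h, w_d, b_2):
    # Decoder: map the latent vector back to the input dimension.
    return dense(w_d, b_2, h)


# Hypothetical (untrained) parameters: 4 inputs -> 2 latent units -> 4 outputs.
w_e = [[0.5, 0.5, 0.0, 0.0],
       [0.0, 0.0, 0.5, 0.5]]
b_1 = [0.0, 0.0]
w_d = [[1.0, 0.0],
       [1.0, 0.0],
       [0.0, 1.0],
       [0.0, 1.0]]
b_2 = [0.0, 0.0, 0.0, 0.0]

imf_window = [0.2, 0.4, 0.6, 0.8]   # a short, made-up slice of one IMF
latent = encode(imf_window, w_e, b_1)
reconstruction = decode(latent, w_d, b_2)
```

The latent vector has fewer entries than the input, so the decoder can only reproduce structure that survives the compression; this bottleneck is what suppresses noise and redundancy once the weights are trained.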

Data prediction
To predict the electricity consumption of residential buildings, the LSTM neural network model is adopted, as it can effectively deal with time-series data. The LSTM model is extensively employed in the domain of time-series prediction problems since it can learn from preceding time steps [21]. It is capable of modeling long-term sequential dependencies in the time-series data by applying gates to the cell states [14]. The architecture of LSTM consists of four neural network components: forget gate, input gate, cell, and output gate [40]. In the first step, it calculates the activation value of the forget gate $f_t$ using input $P_t$ to determine the irrelevant information and remove the information that is no longer useful in the previous cell state $C_{t-1}$. It is represented by equation (9) [40]. To produce an updated cell state $C_t$, a vector of new candidate values $\tilde{C}_t$ is used. The cell is responsible for retaining the information for a long time, and the formula is given in equations (10) and (11) [14,40]. The input gate $i_t$ decides which information should be entered into the memory cell at the current timestamp, and it is given by equation (13). The updated hidden state is obtained by equation (12) [40]. The output gate $o_t$ manages the output values of the cell state and contains a sigmoid layer to filter the output. Further, the updated cell state $C_t$ is forwarded to tanh, which normalizes the values between $-1$ and $1$. The final output of the cell is obtained by multiplying the output and the new cell state, as given by equation (14) [14]. In the above equations, $P_t$ depicts the input value while $o_t$ is the output value at the current time $t$.

Algorithm 1. Season-wise clustering of electricity consumption time-series data using BIRCH
Input: Time series dataset ($P_t$ as $p_t$), $C$ = data points in each cluster, $K$ = max clusters
Output: Season-wise energy clusters
Begin
    ...
    Delete $CF_{j+1}$ from $CF$
    update $d(p_i, p_{i+1})$ and $d(p_j, p_{j+1})$
    end while
end function

The procedure to implement the proposed hybrid prediction algorithm is stated in Algorithm 2. Using the algorithm, LSTM models have been trained on the reconstructed electricity time series produced by the I-CEEMDAN and autoencoder models. The general approach is to build an LSTM model using a one-dimensional array $(x_1, x_2, x_3, \ldots, x_n)$ to predict the data points. However, this input format is not valid for predicting the time series data. Accordingly, input data has been transformed into a three-dimensional input matrix $(P, T, D)$, where $P$ indicates the input data points, $T$ denotes the series length, and $D$ indicates input features. The three-dimensional dataset $P_t(P_i, t_p, P_o)$ has been prepared from the lagged values $P_t, P_{t-1}, P_{t-2}, P_{t-3}, \ldots, P_{t-15}$, where $P_i$ denotes the number of input samples, $t_p$ indicates the historical timestamp, and $P_o$ depicts the prediction output. The lagged parameters play a vital role in obtaining accurate predictions from historical datasets with seasonality. Apart from this, the hyperparameters should be selected very carefully because they can impact the performance of the LSTM model.
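For reference, the gate computations described above are commonly written in the following standard form. This is the textbook LSTM formulation rather than a reproduction of the paper's numbered equations; the weight matrices $W$, the biases $b$, the hidden state $h_t$, the concatenation $[h_{t-1}, P_t]$, and the element-wise product $\odot$ are notational assumptions.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f \, [h_{t-1}, P_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i \, [h_{t-1}, P_t] + b_i\right) && \text{(input gate)} \\
\tilde{C}_t &= \tanh\left(W_C \, [h_{t-1}, P_t] + b_C\right) && \text{(candidate values)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(updated cell state)} \\
o_t &= \sigma\left(W_o \, [h_{t-1}, P_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh\left(C_t\right) && \text{(hidden state / cell output)}
\end{aligned}
```

The sigmoid $\sigma$ keeps each gate value between 0 and 1, so a gate acts as a soft switch on the quantity it multiplies, while tanh bounds the candidate values and the cell output between $-1$ and $1$.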

Hyper-parameters selection:
The LSTM model has several hyperparameters, for which the following values were selected: two LSTM layers with 64 neuron units, a dense layer, 100 epochs, a batch size of 64, and tanh as the activation function. These hyperparameters were chosen by training the LSTM model iteratively until it produced accurate predictions. The hidden neuron values 16, 32, and 64 were tested, and 64 neuron units gave better results. An adaptive learning-based optimization algorithm, the ADAM optimizer [41], is used to train the proposed prediction model with a learning rate of 0.01. The ADAM optimizer has fast computation time and performs better than other optimizers.
Sliding window size: The optimal input/output window size must be chosen to achieve precise predictions. The input and output window size is based on the prediction horizon. In the proposed work, the input_window size is 15, and the output_window size is 1; although different input_window sizes such as 7, 10, and 30 have been verified, optimal results have been obtained with 15.
The proposed work applied the sliding window approach [42], which provides the previous timestamp values to the LSTM model by splitting the time series data (size $N$) into $(N - \text{output\_window} - \text{input\_window})$ subsequences of length $(\text{output\_window} + \text{input\_window})$. The sliding window moves over the entire dataset, and this process of iterating over input_window and output_window goes on until it reaches the last window.
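The sliding window split can be sketched as below. The function name and the parameter names `in_window` and `out_window` are chosen for this example; the loop bound follows the subsequence count stated in the text, and the defaults match the reported window sizes (15 inputs, 1 output).

```python
def sliding_windows(series, in_window=15, out_window=1):
    """Split a series into (input, target) pairs of lagged values.

    The loop bound mirrors the count given in the text:
    N - in_window - out_window subsequences. Using
    N - in_window - out_window + 1 instead would also include
    the final complete window.
    """
    n = len(series)
    samples = []
    for start in range(n - in_window - out_window):
        inputs = series[start:start + in_window]
        targets = series[start + in_window:start + in_window + out_window]
        samples.append((inputs, targets))
    return samples


# A made-up series of 20 daily readings: values 0..19 stand in for kWh.
toy_series = list(range(20))
pairs = sliding_windows(toy_series, in_window=15, out_window=1)
first_inputs, first_target = pairs[0]
```

Each input list would then be reshaped to (samples, time steps, features), here (len(pairs), 15, 1), before being passed to an LSTM layer.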

Performance metrics
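The proposed multi-step prediction model is assessed using widely used statistical measures such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE). The performance metrics are computed using the following equations (15)-(17):
$$MAE = \frac{1}{n} \sum_{t=1}^{n} \left| p_t - \hat{p}_t \right| \quad (15)$$
$$MSE = \frac{1}{n} \sum_{t=1}^{n} \left( p_t - \hat{p}_t \right)^2 \quad (16)$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( p_t - \hat{p}_t \right)^2} \quad (17)$$
where $n$ is the number of power measurements, $p_t$ is the actual value in the test set, and $\hat{p}_t$ is the predicted electricity consumption of residential buildings.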
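The three metrics can be computed directly from their definitions. The following is a small sketch with hypothetical function names and made-up sample values, not results from the paper.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Squared Error: average of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


# Made-up actual and predicted consumption values (kWh).
actual_kwh = [1.0, 2.0, 3.0]
predicted_kwh = [1.5, 2.0, 2.0]
scores = {
    "MAE": mae(actual_kwh, predicted_kwh),
    "MSE": mse(actual_kwh, predicted_kwh),
    "RMSE": rmse(actual_kwh, predicted_kwh),
}
```

MAE and RMSE are in the same unit as the load (kWh), whereas MSE is in squared units and penalises large errors more heavily.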

Experimental results
The proposed work has been implemented on a real-time electricity dataset provided by PSPCL, Punjab. The electricity consumption dataset is stored in a Comma-Separated-Value (CSV) file format. The training and testing experiments are performed on an Intel Core i5 with 16 GB RAM and Windows 10 operating system.

Cluster analysis
To achieve precise and realistic predictions of energy consumption trends, data clustering has been performed to extract more detailed and specific information pertaining to electricity usage. The aim of cluster analysis is to provide a better understanding of data during different days, weeks, and months of the year. Data clustering has explored seasonal variation in electricity consumption due to changing weather conditions throughout the year. The results and findings of the proposed data clustering approach are discussed below.
Seasonal analysis: The data clustering phase extracted three (summer, rain, and winter) weather clusters. The cluster distribution throughout the year is shown in Figure 4. It shows that cluster 1 (winter season cluster) is mainly distributed in October end, November, December, January, February, and the beginning of March. Meanwhile, cluster 2 (summer season cluster) is distributed between early March, April, May, and June. However, cluster 3 (rainy season cluster) is spread across July, August, September, and mid-October. This analysis signifies that the weather conditions are heterogeneous throughout the year, influencing energy consumption predictions.
Cluster-wise trend analysis: The output of the clustering algorithm has been used to depict the energy consumption patterns within each cluster. This paper considers individual residential buildings for trend analysis of electricity usage during different months of the year, as shown in Figure 5. It is evident from Figure 5 that the energy consumption trend varies according to changing weather conditions throughout the year.
In the subsequent stage, cluster analysis results were used to extract different frequency components from electrical consumption data.

Decomposition and reconstruction analysis
After performing seasonal and cluster analysis, the next step is to decompose the electricity time series data into several sub-signals and residuals using the I-CEEMDAN algorithm. Data decomposition revealed the underlying patterns and trends in load time series data, as shown in Figure 2. It separated the seasonal patterns, trends, and residual noise, which is essential for prediction accuracy. The decomposition algorithm has been implemented using the pyEMD [43] package. The data decomposition process obtained seven sub-signals $(IMF_1, IMF_2, \ldots, IMF_7)$, and an example decomposition is shown in Figure 2, arranged from highest to lowest frequency range. The first mode component $IMF_1$ shows highly irregular patterns, while $IMF_2$ to $IMF_8$ represent periodic patterns, and the last $IMF_9$ depicts the general trend of energy consumption. Next, the autoencoder model is built and trained to reconstruct the original signal for the extracted mode components. The autoencoder model merges the meaningful sub-signals and obtains a noise-free series of electricity consumption features. The aforementioned process is repeated for five residential buildings' electricity consumption datasets.

Prediction results and analysis
The objective of the present work is to predict the daily electricity consumption of residential buildings. An LSTM-based prediction model has been built and trained for the given decomposed and seasonal data. To demonstrate the effectiveness of the multi-step prediction approach, four other state-of-the-art models have been trained to predict the electricity consumption of individual residential buildings. The widely used statistical measures such as MAE, RMSE, and MSE have been used to verify the prediction performance. Figure 6 visually represents the predicted electricity consumption in individual residential buildings, where the x-axis and y-axis denote the number of data points and electricity consumption values, respectively. The orange line shows estimated electricity demand, whereas the blue line indicates actual consumption. Table 1 presents the prediction performance of five models on five residential buildings' datasets. For RB-4, the experimental results show that the proposed I-CEEMDAN-LSTM approach obtained a minimum MAE of 0.114 kWh, while the MAE values of the SVR, RF, RNN, and LSTM models are 0.195 kWh, 0.162 kWh, 0.137 kWh, and 0.131 kWh, respectively. For other residential buildings, such as RB-1 and RB-5, the proposed model attained accurate electricity load predictions and achieved an MAE of 0.115 kWh. For residential building RB-2, the prediction error is slightly higher than for the other buildings because data points are more densely concentrated for some days and also show sudden fluctuations, as shown in Figure 3. The comparative analysis shows that the proposed multi-step prediction approach outperformed the other models in terms of MAE, MSE, and RMSE when predicting energy consumption in residential buildings. The accuracy of the proposed approach has been compared with state-of-the-art prediction models using the percentage improvement formula, which has been used to calculate the percentage improvement of MAE, RMSE, and MSE between any two prediction models. Compared to other state-of-the-art models, the percentage error improvement attained by the proposed I-CEEMDAN-LSTM model is presented in Table 2, which is discussed in the following subsection.
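A percentage improvement between two models is commonly computed as the relative reduction in error with respect to the baseline model. The sketch below assumes that form, since the exact formula is not reproduced in this text; the function name is hypothetical. The example call uses the RB-4 MAE values of the SVR model and the proposed model quoted above.

```python
def percentage_improvement(baseline_error, proposed_error):
    """Relative error reduction of the proposed model over a baseline, in percent.

    Assumed form: (baseline - proposed) / baseline * 100.
    """
    if baseline_error == 0:
        raise ValueError("baseline error must be non-zero")
    return (baseline_error - proposed_error) / baseline_error * 100.0


# RB-4 MAE values quoted in the text: SVR 0.195 kWh, proposed model 0.114 kWh.
svr_vs_proposed = percentage_improvement(0.195, 0.114)
```

The same function applies unchanged to RMSE and MSE, since all three are errors for which lower values are better.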

Discussion
The performance improvement of the proposed multi-step model is also compared with three state-of-the-art prediction models, namely, SVR, RF, and RNN. The following inferences can be drawn from the prediction results obtained by the proposed I-CEEMDAN-LSTM approach and the other state-of-the-art models, as listed in Tables 1 and 2.
Electricity consumption load exhibits fluctuating and non-linear behavior, making it challenging to predict accurately using a single machine learning-based model. Therefore, the proposed hybrid prediction approach outperformed the single models in the given real-time scenarios.

Implications and limitations
Real-time residential buildings' electricity consumption data has been exploited to generate the historical dataset for the proposed multi-step prediction model. The electricity consumption in real-time residential buildings shows non-linear and non-stationary trends. Once the proposed prediction model has been well-trained and tested on the real-time dataset, it could be deployed by the electricity distribution sector as a prediction and analytical tool. It would also be helpful to raise awareness among consumers through daily energy consumption patterns, and consumers can link their present usage with the future cost.
Incorporating cluster analysis and data decomposition provided significant findings in real-world scenarios. The effectiveness of the proposed work relies significantly on the quantity and quality of the dataset. The quality of real-time data should be evaluated to ensure its completeness and consistency. A benchmark or simulated dataset taken from reliable sources can be utilized where real-time data collection is not feasible.

Conclusion
This paper proposed a deep learning-based multi-step approach to predict electricity consumption in the residential sector. In the first step, seasonal trend analysis has been performed to obtain season-based temporal data. The second step applied an improved CEEMDAN method to decompose the electricity consumption time series into IMFs, which removes irregular patterns, noise, and non-stationary components. Then, an autoencoder model has been implemented to reconstruct the original series using IMFs. Subsequently, the LSTM network model has been developed and trained by considering the historical, seasonal, and temporal data dependencies. The effectiveness of the proposed approach has been verified using real-time residential buildings in Punjab, India. The experimental results revealed that the proposed hybrid I-CEEMDAN-LSTM approach achieves a lower prediction error (MAE: 0.114) than the existing RF-LSTM model based on the CEEMDAN method (MAE: 0.299). Further, the proposed prediction model could also be used for other

Figure 2. Proposed electricity consumption forecasting approach for residential buildings.

Figure 3. Real-time electricity consumption data of five residential buildings.
$w_e$, $w_d$ are weight values, and $b_1$, $b_2$ are bias values for the encoder and decoder networks, respectively. The autoencoder model training extracts and reconstructs the mode components generated by the I-CEEMDAN algorithm.

Figure 5. Cluster analysis of electricity consumption patterns in residential buildings during different seasons of the year.

Figure 6. Predicted and actual electricity consumption of residential buildings using hybrid improved CEEMDAN-LSTM approach.

Table 1. Performance of proposed I-CEEMDAN-LSTM model using real-world electricity dataset in Punjab, India (where RB is Residential Building).

Table 2. Improved percentage results of the proposed I-CEEMDAN-LSTM model compared to the state-of-the-art model in terms of MAE, RMSE, and MSE values.

Author(s): Science and Technology for Energy Transition 79, 7 (2024)

time series data that exhibit non-linear and non-stationary characteristics. The influential factors, such as building design features, time-based pricing, operational hours, and user behavior, show potential for future research and analysis. The