Longitudinal Dataset of Net-load, PV Production and Solar Irradiation from Madeira Island, Portugal – Nature

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.
Advertisement
Scientific Data volume 12, Article number: 1892 (2025)
2009 Accesses
Metrics details
This paper presents the PTProsumer dataset, a high-resolution dataset of photovoltaic (PV) production and net-load measurements collected from 24 prosumers – entities that both produce and consume electricity, including households and small commercial buildings – on Madeira Island, Portugal. The dataset covers monitoring periods ranging from 3 months to 5 years, with measurements sampled at a 1-second resolution, resulting in approximately 3.89 billion data points. PV production and electricity consumption are provided as separate signals, enabling detailed analysis of local energy flows. The dataset also includes 1-minute resolution solar irradiation estimates from the Copernicus Atmosphere Monitoring Service. Files are provided in CSV format, organized by year, and are accompanied by a public code repository with examples for data processing. This dataset supports research in forecasting, flexibility management, grid resilience, and energy community modeling, contributing to the growing need for open-access, long-term energy datasets.
The global energy landscape is undergoing a profound transformation with the rise of prosumers —consumers who also produce energy, most notably through residential and small commercial Solar Photovoltaic (PV) systems. By 2024, Europe’s total installed solar capacity reached 338 GW, up from 272.5 GW in 2023 and 207 GW in 20221. However, despite this overall growth, residential rooftop solar installations declined. New home installations dropped by nearly 5 GW compared to the previous year, totaling only 12.8 GW in 20241. This slowdown is attributed not only to reduced financial incentives and grid connection delays, but also to increasing regulatory uncertainty—including limitations on self-consumption and the export of surplus generation—which complicate the economic viability for individual households1,2.
These evolving regulatory and market conditions highlight the urgent need to better manage distributed assets, ensuring that prosumers can maximize the value of their generation and continue contributing to local energy resilience and flexibility. However, this demands differentiated management strategies: targeting flexible demand response mechanisms for highly variable, high-demand sites, while optimizing storage solutions or export mechanisms for prosumers with frequent surplus generation, e.g.,3,4,5.
This emerging reality highlights the importance of high-quality prosumer datasets, particularly those with long-term monitoring to accurately capture seasonal variations in consumption and generation patterns. Although numerous public datasets focus on household electricity consumption, as surveyed in several review papers, e.g.,6,7, there remains a shortage of comprehensive real-world datasets that include both PV generation and net-load measurements.
Among the few datasets offering both consumption and PV production data, PecanStreet is the most widely known8. It provides high-resolution (up to 1-second) household consumption, solar generation, and appliance-level data for thousands of U.S. households. Data collection started in 2012 and continues today, with multi-year records available for many PV sites. A consultation of the latest available metadata at the time of writing revealed that 551 of 2035 sites have both consumption and production data. However, full access is restricted since researchers must sign a data use agreement, and some datasets require payment. A smaller, free subset is available for academic research. The Smart* SunDance dataset provides hourly net-metered consumption, solar generation, and weather data for 100 solar sites across North America, covering a full year from January 2015 to January 2016, and has been utilized for studies on solar forecasting and energy modeling9. As of this writing, the SunDance dataset can be downloaded from the UMass Trace Repository (https://traces.cs.umass.edu/docs/traces/smartstar/#sundance-dataset, Accessed: Jul. 30, 2025). The StoreNet dataset provides detailed energy data from 20 households in Ireland, collected under the StoreNet project10. It includes power and energy measurements (consumption, PV generation, grid import/export, battery charging/discharging, and state of charge) along with local weather data at 1-minute resolution for the year 2020. For 10 houses, there are also PV production measurements, from systems with capacities ranging between 2 and 2.2 kWp. The HEMStoEC database offers detailed monitoring data over more than three years (January 2020 to February 2023) for four houses in southern Portugal, including high-frequency (1-second and 1-minute) records of consumption, PV and battery data, indoor climate conditions, and device operation, with synchronized datasets available at 5-minute intervals and detailed metadata on data gaps and interpolation periods11. Finally, a very recent dataset is from a residential dwelling in Estonia, providing one year (2023) of 10-second resolution PV generation and power flow measurements12.
To help address this lack of representative and reusable data, we present PTProsumer13, a dataset featuring measurements of net-load and PV production at 1-second resolution for 24 prosumers located exclusively on Madeira Island, Portugal. The data collection spans monitoring periods ranging from approximately 3 months to 5 years per prosumer. PTProsumer was collected within the scope of the H2020 Smart Island Energy Systems (SMILE) project (https://cordis.europa.eu/project/id/731249, Accessed Jul. 30, 2025), an initiative focused on demonstrating smart grid solutions for isolated energy systems, with demonstrators in three European islands, namely Madeira, Orkney, and Samsø. Additionally, the dataset was extended with irradiation data from the Copernicus Atmosphere Monitoring Service (CAMS), including Global Horizontal Irradiance (GHI), and Diffuse Horizontal Irradiance (DHI), providing a more complete characterization of the solar resource conditions during the monitoring period.
For a direct comparison among the mentioned datasets, Table 1 summarizes their key attributes, including location, number of sites, type of prosumers, dataset duration, sampling rate, and the available measurements. Among the six datasets compared, PTProsumer distinguishes itself through its combination of high-frequency (1-second) measurements, building types (including apartments, houses, and commercial spaces), and extended monitoring periods ranging from 3 months to 5 years. Covering data from 2018 to 2024 and including a wide array of electrical parameters along with solar irradiance, it complements existing datasets by offering a uniquely rich and versatile resource for studying European prosumers in real-world conditions.
Given its high temporal resolution and long-term coverage, this dataset holds significant potential for reuse across research domains that rely on smart meter data, such as demand forecasting, flexibility characterization, PV self-consumption optimization, and energy community modeling14,15,16. The fine-grained resolution enables detailed analysis of consumption and PV production dynamics, while also allowing for flexible aggregation to coarser resolutions (e.g., 15-minute and hourly), which is often required in practical applications. The extended monitoring duration further supports longitudinal analyses of prosumer behavior and seasonal patterns. The dataset’s relevance has already been demonstrated through several scientific outputs, including techno-economic analyses of PV-BESS systems17,18, storage sizing19,20, explainable AI in PV production forecasting21, and federated learning approaches for flexibility markets22.
This paper thoroughly describes how the PTProsumer dataset was collected, including detailed information on post-processing and organizing the data to form the dataset. This paper also analyzes the data quality and provides instructions for its reuse.
The data collection for Net-Load and PV production was carried out using Carlo Gavazzi (CG) smart meters. Specifically, two CG meters were DIN-mounted in the main breaker box: one for measuring grid consumption and another for PV production. Three different models were used, selected based on installation requirements.
EM111: A compact model (17.5 mm wide) that supports currents up to 45A and a maximum cable cross-section of 6mm2. This model was primarily used for PV production measurements.
EM112: Similar to the EM111 but slightly wider (35 mm), supporting currents up to 100A and cable cross-sections up to 25mm2. This device was mainly used for consumption measurements.
EM340: A three-phase smart meter supporting currents up to 100A and cable cross-sections up to 25mm2, used for three-phase installations.
Figure 1 illustrates the main components of the data collection setup. The CG meters were installed in the main breaker box, and a Raspberry Pi 3 gateway was connected to them via a serial port using the RS485 communication protocol. The gateway sequentially requested the latest measurements from each meter at a configurable scanning interval, set to one second in this setup. The timestamp of the request was shared between the two meters, ensuring that records from a single scan had the same timestamp.
Overview of the net-load and PV production data collection setup for a three-phase installation.
The collected samples were stored in a relational database on the gateway and uploaded every minute to an online database server, providing real-time access to consumption data23. Additionally, at 12 AM daily, a Comma Separated Values (CSV) file containing the day’s readings for each CG meter was uploaded to a shared folder. After successful uploads, the local database was cleared to minimize storage footprint.
Solar irradiation data was obtained using the CAMS service24, which provides high-resolution estimates under both clear-sky and all-sky conditions. The clear-sky irradiation values are produced using the CAMS global forecasting system, which incorporates atmospheric components such as aerosol optical depth, ozone concentration, and water vapor content. These components are essential in the CAMS methodology, as they influence the scattering and absorption of solar radiation in the atmosphere, thereby affecting the irradiance reaching the surface. All-sky irradiation estimates, on the other hand, are derived from satellite imagery within the field-of-view of the Meteosat Second Generation (MSG) and Himawari satellites. Irradiation data was retrieved at a 1-minute temporal resolution, aligned with the net-load and PV production measurements, and geolocated using the approximate coordinates of each prosumer—omitted here for privacy.
The recruitment of prosumers and the deployment of the data collection setup occurred in three phases: i) between March and June of 2018, ii) in October and November of 2020, and iii) between February and March 2020. The dataset comprises 24 monitored sites, representing a diverse set of prosumers, including residential houses (17), apartments (4), an office, a restaurant, and a supplier.
Ethical approval for the data collection activities was granted by the SMILE project’s Ethics Board, established at Month 3 of the project to oversee compliance with ethical and privacy requirements defined under Work Package 10 (“Ethics Requirements”). The Ethics Board reviewed and approved the procedures and consent forms used across the three project demonstrators and continuously monitored the project to ensure participant protection and data privacy. All participants signed the informed consent prior to the installation of monitoring equipment and the start of data collection. Ethical requirements and approvals are documented in the project’s internal deliverables D10.1 (H-Requirement No. 1) and D10.2 (POPD – Requirement No. 2).
In each site, the consumption meter was installed immediately downstream of the utility meter, measuring only the energy drawn from the grid (net-load), not the total household consumption. On the other hand, the PV meter was installed directly upstream of the PV inverter, capturing the total PV generation before any self-consumption or grid export occurs.
Table 2 provides detailed information on each monitored site, including the type of prosumer, connection characteristics, monitoring periods, and data completeness percentages. The monitored sites vary in grid connection phases (single-phase and three-phase) and peak power capacities, with grid-side peak power ranging from 3.45 kW to 20.7 kW and production-side peak power from 0.39 kW to 4.32 kW. The data collection periods span from March 2018 to December 2023, with most sites achieving high data completeness, typically exceeding 90%. However, certain locations, particularly apartments, exhibit lower completion rates, with some below 50%, due to interruptions in data collection as a result of hardware connectivity issues. In many apartments, the main electrical panels were located near the entrance, where the Wi-Fi signal strength was insufficient for reliable transmission. Additionally, changes to Wi-Fi credentials by residents, without notice to the research team, required on-site visits to reestablish connectivity. In 23 of the 25 sites, the completion rates of Net-load and PV production are very similar. The only exception is PR_21 (Net-load: 92.4%, PV: 73.98%), due to technical issues with the PV smart meter. Net-load data is available as single-phase for 20 sites and three-phase for 4 sites, while production data is available as single-phase for 22 sites and three-phase for 2 sites.
The PV production data was post-processed to account for small nighttime reverse currents, mitigate measurement noise from the inverter and sensors, and correct for power consumption by the monitoring equipment. These factors can introduce slight deviations in the form of small non-zero values, despite the physical impossibility of solar generation in the absence of sunlight. Using the Astral (v3.2) Python package to determine sunrise and sunset times based on each prosumer’s approximate geographical coordinates, recorded PV power values outside the daylight periods were assumed to be measurement artifacts and were consequently set to zero.
In contrast, net-load data did not require post-processing, as grid import/export measurements are generally more stable and less affected by sensor noise. Additionally, reverse currents do not impact net-load since they are naturally accounted for in grid import/export measurements, and the power consumption of monitoring equipment is negligible compared to total household consumption.
The PTProsumer dataset is made available individually for each monitored prosumer, and all the data files are in CSV format. The data is available on the Open Science Framework (OSF) data repository13. Figure 2 shows an overview of the underlying organization of the PTProsumer dataset. The following subsections describe the contents of the different files.
Underlying folder and file organization of PTProsumer Dataset.
A CSV version of Table 2 is available in the datasets’ root folder in a file named metadata.csv. This file contains 24 rows. The columns are described in Table 3.
The consumption measurements are available in individual files for each year in the Net-Load folder. The measurement files, named pr<ID>_nl_<year>.csv, are available in the root folder of each prosumer (PR_<ID>). The production files follow the same organization but instead are stored under the Production folder and titled pr<ID>_prod_<year>.csv. The columns of the CSV files are described in Table 4.
The solar irradiation measurements are available in individual files for each year in the Irradiation folder. The measurement files, named pr<ID>_irrad_<year>.csv, are available in the root folder of each prosumer (PR_<ID>). The columns of the CSV files are described in Table 5.
The net-load and PV production data availability for all PTProsumer sites can be seen in Figs. 3 and 4. The gaps indicate periods when the data were unavailable. The vertical right axis of both figures shows the data availability (%), calculated as the ratio of available data points to the total expected data points over the period each prosumer was monitored.
PTProsumer Net-load Data Availability. Gaps in the line represent an area where data was unavailable for more than a quarter of a day.
PTProsumer PV Production Data Availability. Gaps in the line represent an area where data was unavailable for more than a quarter of a day.
The monitoring campaign had varying durations and data completeness levels. The earliest installation began on March 29, 2018, while the latest started on March 13, 2020, with monitoring periods ranging from just a few months to nearly 5.5 years. Some sites, such as Site 9 (House) and Site 14 (Apartment), were monitored until December 31, 2023, providing some of the longest datasets. In contrast, Site 8 (House) was active for only three months, and Site 23 (House) for just ten months, limiting their ability to capture long-term trends.
A key observation is that long monitoring periods do not necessarily imply high data completeness. Site 17 (Apartment), despite being monitored for almost five years, suffered from significant data loss, with a completeness of just 27.2%, indicating extended periods of missing data. Similarly, Site 14 (Apartment), which also had one of the longest monitoring durations, maintained only 47.6% completeness. Conversely, some short-duration sites achieved nearly uninterrupted data collection. Site 8 (House), monitored for only three months, maintained a 98.20% completeness rate, ensuring a high-quality dataset despite its brevity. Site 7 (House), with a monitoring period of just under three years, achieved 99.40% completeness, demonstrating the stability and reliability of its data stream. Among the longest monitored sites, Site 3 (Restaurant) and Site 20 (House) stand out, as both provided four to five years of data with nearly 99% completeness.
To further understand the characteristics of each site, the histograms in Figs. 5, 6, 7 provide a statistical overview of the data by illustrating the distribution of key variables (active power for net-load and PV production, and GHI for irradiation). They aim to illustrate the distribution of values, enabling the identification of central tendencies, dispersion, and potential anomalies within the collected data. Each histogram is complemented by key statistical values such as Mean, Minimum, and Maximum, which support the interpretation of the data’s range and central tendency.
PTProsumer Net-load data distribution represented through histograms. Mean, minimum, and maximum values are indicated to support the interpretation of central tendency and range.
PTProsumer PV Power distribution represented through histograms. Mean maximum values are indicated to support the interpretation of central tendency and range.
PT Prosumer Irradiation distribution represented through histograms. Mean maximum values are indicated to support the interpretation of central tendency and range.
The net-load statistics of the 24 prosumers reveal a highly heterogeneous group, with substantial differences in consumption and generation dynamics. Several prosumers (e.g., PR_2, PR_8, PR_15) exhibit frequent periods of net-export, as indicated by negative or near-zero 25th percentile values. In contrast, others, such as PR_21 and PR_23, maintain consistently positive net-loads across all quartiles, reflecting a strong and steady demand profile. Prosumers like PR_3, PR_10, and PR_21 reach maximum net-loads exceeding 10 kW. High variability is also evident, particularly for PR_3, PR_21, and PR_23, where standard deviations approach or exceed 1 kW, while more stable profiles are seen in prosumers like PR_5 and PR_7.
The PV production profiles in the dataset demonstrate considerable diversity, with mean values ranging from approximately 150 W to over 830 W across different prosumers. Several sites, such as PR_2, PR_9, PR_11, and PR_23, recorded maximum generation peaks well above 3000 W, indicating substantial PV system sizes. The standard deviation is consistently high relative to the mean for many sites, indicating significant intra-day and inter-day variability. Percentile values further reveal that for most prosumers, production is skewed, with median values (50th percentile) often much lower than the maximum observed outputs. This highlights that, although large generation events occur, they are not dominant throughout the measurement period.
The Global Horizontal Irradiance (GHI) across the 24 prosumer sites is relatively consistent, with mean values typically between 6.3 and 7.3 W/m2, indicating uniform solar resource availability across the region. The standard deviation ranges between 4.5 and 5.2 W/m2, reflecting natural daily and seasonal solar variability. Overall, the GHI data show that differences in PV production between prosumers are more likely due to system-specific factors such as panel make and model, inverter efficiency, installation orientation and tilt, shading from nearby objects, maintenance practices, or variations in monitoring equipment, rather than significant differences in solar irradiance.
The PTProsumer dataset13 is made available in CSV format, which is usable in most scientific computing packages, e.g., Python (pandas/numpy), MATLAB (readmatrix), and R (read.csv). However, due to the large size of the raw data, it was decided to store the files in compressed format using gzip. This is a very common protocol, and the files can be easily unzipped using software such as 7-Zip, WinRAR, or the built-in gunzip utility on Linux and macOS systems. Alternatively, most scientific programming packages offer tools to decompress and read the CSV files directly; for example, in Python, libraries such as pandas, gzip, and zipfile can be used. Given the volume of data, parallel processing can also be beneficial for faster data handling. We provide example Python3 scripts and guidance for both sequential and parallel data processing in the available source code.
While there is some missing data in every PTProsumer, we deliberately did not clean or strip any data. This allows us to retain the data from its raw form as closely as possible. Furthermore, this leaves room for the dataset users to apply their preferred data-cleaning and filling methods. For example, in the case of PV production, missing data can be completed leveraging irradiation data and the information about the installed PV capacity on each site. Several libraries exist for this effect, including pvlib python25. As for the net-load, several methods can be applied depending on the number of missing points. Such strategies can vary from quick fixes such as forward/backward filling or interpolation (linear or weighted) to more advanced techniques using forecasting algorithms, such as the Auto-Regressive Integrated Moving Average (ARIMA)26.
This dataset was collected in Madeira Island in Portugal, and during the data collection, two different time zones were observed: Western European Time (WET) and Western European Summer Time (WEST), following the region’s standard daylight saving time changes. However, to simplify the dataset handling, it was decided to ignore the timezone by representing the timestamps in Coordinated Universal Time (UTC). To state more precisely, when DST starts (clocks move forward), local time skips one hour (e.g., from 1:00 AM to 2:00 AM), but since time is recorded in UTC, no data is lost—timestamps simply continue uninterrupted (e.g., 00:59:59 UTC  → 01:00:00 UTC). Similarly, when DST ends (clocks move back), local time repeats an hour (e.g., 2:00 AM occurs twice), but again, because of using UTC, there is no duplication—timestamps continue progressing forward without repeating. If needed, local time can be recovered from UTC using the known time zone of Madeira (WET/WEST) and standard libraries. For example, using Python pytz, setting the timezone to Atlantic/Madeira.
Data collection overlapped with the COVID-19 pandemic, particularly during the first strict lockdown period in Madeira Island, which lasted from March 19 to May 3, 2020. This context may have led to atypical consumption behaviors, especially due to stay-at-home orders and the closure of commercial activities.
While the dataset captures detailed net-load and PV production measurements from each prosumer, it does not include contextual information about the physical installation conditions of the PV systems. Specifically, no data were collected regarding panel orientation, tilt, or potential sources of partial shading such as nearby buildings, trees, or obstructions. As a result, variations in PV output caused by suboptimal placement or shadowing are not accounted for. This limitation should be considered when interpreting the production data or using it to develop or validate models that assume ideal or standardized PV operating conditions.
Likewise, despite the fact that the dataset includes GHI estimates obtained from the CAMS radiation service, no on-site pyranometer measurements were available to validate the modeled values. As such, potential discrepancies between modeled and ground-based radiation values are not quantified. Future deployments may incorporate on-site solar irradiation measurements to improve the accuracy and validation of the irradiance data. Additionally, future work could include comparisons with alternative forecasting services, such as Forecast.Solar, to assess the consistency and potential biases across different sources of solar irradiation estimates.
Table 6 provides a brief description of the available source code.
The dataset13 is publicly available in its own repository on the Open Science Framework at https://osf.io/9xs3a/.
The Python3 code to reproduce the examples presented in this paper is available on the dataset repository13.
SolarPower Europe, European Market Outlook for Solar Power 2024–2028, Brussels, Belgium: SolarPower Europe, (2024). Available at: https://www.solarpowereurope.org/insights/outlooks/eu-market-outlook-for-solar-power-2024-2028 (Accessed: 2025).
International Energy Agency – Photovoltaic Power Systems Program, PVPS Annual Report 2024: Country Updates, NSW, Australia: International Energy Agency, Apr. (2025). Available at: https://iea-pvps.org/wp-content/uploads/2025/04/PVPS_Annual_Report_2024_Country_Updates.pdf (Accessed: 2025).
Pamfile L. V. & Proscanu, M. E. Energy Transition: How Solar PV Would Shape the Final Household Electricity Consumption. An Economic Analysis on Sizing, Integration and Risk of Prosumer Energy Systems. in Rethinking Business for Sustainable Leadership in a VUCA World, edited by M. Busu, pp. 245–262, Springer Nature Switzerland, Cham, https://doi.org/10.1007/978-3-031-50208-8_16 (2024).
Díaz-Bello, D., Vargas-Salgado, C., Alcázar-Ortega, M. & Gómez-Navarro, T. Demand response of prosumers integrating storage system for optimizing grid-connected photovoltaics through time-pricing. Journal of Energy Storage 88, 111536, https://doi.org/10.1016/j.est.2024.111536 (2024).
Article  Google Scholar 
Benalcazar, P., Kalka, M. & Kamiński, J. From consumer to prosumer: A model-based analysis of costs and benefits of grid-connected residential PV-battery systems. Energy Policy 191, 114167, https://doi.org/10.1016/j.enpol.2024.114167 (2024).
Article  Google Scholar 
Pereira, L. & Nunes, N. Performance evaluation in non-intrusive load monitoring: datasets, metrics, and tools-a review. WIREs Data Mining and Knowledge Discovery 8(6), e1265, https://doi.org/10.1002/widm.1265 (2018).
Article  Google Scholar 
Haben, S., Arora, S., Giasemidis, G., Voss, M. & Vukadinović Greetham, D. Review of low voltage load forecasting: methods, applications, and recommendations. Applied Energy 304, 117798, https://doi.org/10.1016/j.apenergy.2021.117798 (2021).
Article  Google Scholar 
Pecan Street Project, Inc., Pecan Street Grid Demonstration Program. Final Technology Performance Report, Austin, TX (United States), https://doi.org/10.2172/1172297 (2015).
Chen, D. & Irwin, D. SunDance: black-box behind-the-meter solar disaggregation. in Proceedings of the Eighth International Conference on Future Energy Systems, pp. 45–55, https://doi.org/10.1145/3077839.3077848 (Association for Computing Machinery, New York, NY, USA, May 2017).
Trivedi, R., Bahloul, M., Saif, A., Patra, S. & Khadem, S. Comprehensive dataset on electrical load profiles for energy community in Ireland. Scientific Data 11(1), 621, https://doi.org/10.1038/s41597-024-03454-2 (2024).
Article  PubMed  PubMed Central  Google Scholar 
Ruano, A. & da Graça Ruano, M. From home energy management systems to energy communities: methods and data. Scientific Data 11(1), 346, https://doi.org/10.1038/s41597-024-03184-5 (2024).
Article  PubMed  PubMed Central  Google Scholar 
Hasan, S., Blinov, A., Chub, A. & Vinnikov, D. Solar PV generation and consumption dataset of an Estonian residential dwelling. Scientific Data 12(1), 481, https://doi.org/10.1038/s41597-025-04747-w (2025).
Article  PubMed  PubMed Central  Google Scholar 
L. Pereira, D. R. P. Monteiro, and F. Apina, Longitudinal Dataset of Net-load, PV Production and Solar Irradiation from Madeira Island, Portugal, Available at: https://doi.org/10.17605/OSF.IO/9XS3A, OSF (2025).
Wang, Y., Chen, Q., Hong, T. & Kang, C. Review of smart meter data analytics: applications, methodologies, and challenges. IEEE Transactions on Smart Grid 10(3), 3125–3148, https://doi.org/10.1109/TSG.2018.2818167 (2019).
Article  Google Scholar 
Völker, B., Reinhardt, A., Faustine, A. & Pereira, L. Watt’s up at home? Smart meter data analytics from a consumer-centric perspective. Energies 14(3), 719, https://doi.org/10.3390/en14030719 (2021).
Article  Google Scholar 
Faia, R., Goncalves, C., Gomes, L. & Vale, Z. Dataset of an energy community with prosumer consumption, photovoltaic generation, battery storage, and electric vehicles. Data in Brief 48, 109218, https://doi.org/10.1016/j.dib.2023.109218 (2023).
Article  CAS  PubMed  PubMed Central  Google Scholar 
Pereira, L., Cavaleiro, J. & Barros, L. Economic assessment of solar-powered residential battery energy storage systems: the case of Madeira Island. Portugal. Applied Sciences 10(20), 7366 (2020).
Article  CAS  Google Scholar 
Pereira, L., Cavaleiro, J. & Morais, H. Understanding the role of solar PV and battery energy storage in a snack bar: a case study in Madeira Island, in 2023 IEEE 21st International Conference on Industrial Informatics (INDIN), pp. 1–7, https://doi.org/10.1109/INDIN51400.2023.10218295 (IEEE, 2023).
Hashmi, M. U., Pereira, L., & Bušić, A. Energy storage in Madeira, Portugal: co-optimizing for arbitrage, self-sufficiency, peak shaving and energy backup. in 2019 IEEE Milan PowerTech, pp. 1–6, https://doi.org/10.1109/PTC.2019.8810531 (IEEE, 2019).
Hashmi, M. U., Cavaleiro, J., Pereira, L. & Bušić, A. Sizing and profitability of energy storage for prosumers in Madeira, Portugal. in 2020 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), pp. 1–5, https://doi.org/10.1109/ISGT45199.2020.9087772 (IEEE, 2020).
Gomes, E., Esteves, A., Morais, H. & Pereira, L. Leveraging explainable artificial intelligence in solar photovoltaic mappings: model explanations and feature selection. Energies 18(5), 1282, https://doi.org/10.3390/en18051282 (2025).
Article  Google Scholar 
Pereira, L., Nair, V., Dias, B., Morais, H., and Annaswamy, A. Federated learning forecasting for strengthening grid reliability and enabling markets for resilience. in IET Conference Proceedings 2024, pp. 246–250, https://doi.org/10.1049/icp.2024.2608 (IET, Chicago, IL, USA, 2025).
Quintal, F. et al. Energy monitoring in the wild: platform development and lessons learned from a real-world demonstrator. Energies 14(18), 5786, https://doi.org/10.3390/en14185786 (2021).
Article  Google Scholar 
Copernicus Atmosphere Monitoring Service (CAMS), CAMS solar radiation time-series. Feb. (2020). Available at: https://ads.atmosphere.copernicus.eu/datasets/cams-solar-radiation-timeseries?tab=overview (Accessed: 13-Jan-2025).
Anderson, K. S. et al. Pvlib python: 2023 project update. Journal of Open Source Software 8(92), 5994, https://doi.org/10.21105/joss.05994 (2023).
Article  ADS  Google Scholar 
Martins, R. Data-driven Modeling of Energy Consumption in Industrial Kitchens – Detection of Activations and Unsupervised Classification, Available at: https://fenix.tecnico.ulisboa.pt/cursos/mecd/dissertacao/565303595503325 (M.Sc. thesis, Técnico Lisboa, University of Lisbon, Lisbon, Portugal, Nov. 2022).
Download references
This paper is supported by the European Union’s Horizon 2020 R&I program under grant agreement no. 731249, project SMILE (Smart Island Energy Systems) and by project n 56 – “ATE”, financed by European Funds, namely “Recovery and Resilience Plan – Component 5: Agendas Mobilizadoras para a Inovação Empresarial”, included in the NextGenerationEU funding program. This research was also supported by FCT projects 10.54499/2022.15771.MIT; 10.54499/LA/P/0083/2020; 10.54499/UIDP/50009/2020; 10.54499/UIDB/50009/2020; UID/50021/2025 and UID/PRR/50021/2025, and grant CEECIND/01179/2017 (L.P.).
Interactive Technologies Institute, LARSyS, 1049-001, Lisbon, Portugal
Lucas Pereira, Frederick Apina, Sabrina Scuri, Mary Barreto & Filipe Quintal
Instituto Superior Técnico (IST), University of Lisbon, 1049-001, Lisbon, Portugal
Lucas Pereira, Diogo Monteiro & Hugo Morais
Instituto de Engenharia de Sistemas e Computadores – Investigação e Desenvolvimento, 1049-001, Lisbon, Portugal
Diogo Monteiro & Hugo Morais
University of Madeira, Faculty of Exact Sciences and Engineering, 9020-105, Funchal, Portugal
Frederick Apina, Mary Barreto & Filipe Quintal
Design Department, Politecnico di Milano, 20158, Milan, Italy
Sabrina Scuri
PubMed Google Scholar
PubMed Google Scholar
PubMed Google Scholar
PubMed Google Scholar
PubMed Google Scholar
PubMed Google Scholar
PubMed Google Scholar
Conceptualization: L.P.; methodology: L.P., D.M., and F.A.; software development: L.P., D.M., and F.A.; recruitment: L.P., F.Q., S.S., and M.B.; deployment and data collection: L.P., F.Q., S.S., and M.B.; dataset curation: L.P., D.M., and F.A.; writing original draft: L.P., D.M., and F.A. review and editing: L.P., D.M., F.A., and H.M.; visualizations: L.P., D.M., and F.A.; formal analysis: L.P., D.M., and F.A.; supervision: L.P., HM.; funding: L.P., F.Q., M.B., and H.M,; All authors have read and agreed to the published version of the manuscript.
Correspondence to Lucas Pereira.
Lucas Pereira is an Editorial Board Member for Scientific Data, handling manuscripts for the journal and advising on its policy and operations, but did not manage any component of this review.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Reprints and permissions
Pereira, L., Monteiro, D., Apina, F. et al. Longitudinal Dataset of Net-load, PV Production and Solar Irradiation from Madeira Island, Portugal. Sci Data 12, 1892 (2025). https://doi.org/10.1038/s41597-025-06118-x
Download citation
Received:
Accepted:
Published:
Version of record:
DOI: https://doi.org/10.1038/s41597-025-06118-x
Anyone you share the following link with will be able to read this content:
Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative
Advertisement
Scientific Data (Sci Data)
ISSN 2052-4463 (online)
© 2026 Springer Nature Limited
Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

source

This entry was posted in Renewables. Bookmark the permalink.

Leave a Reply