Assessment of different methods for estimation of missing rainfall data
Access
info:eu-repo/semantics/openAccess
Attribution 3.0 United States
http://creativecommons.org/licenses/by/3.0/us/
Date
2024
Abstract
Missing data is a common problem in many fields, including clinical research, environmental sciences and hydrology. To obtain reliable results from an analysis, the data inventory must first be completed. This paper presents a methodology for addressing the missing data problem by examining the missing data structure and missing data techniques. Simulated datasets were created by considering the number of missing values, the missing data pattern and the missing data mechanism of real datasets containing missing values, factors that are often overlooked in hydrology. Considering the missing data pattern, the methods most commonly used for missing data analysis in hydrology and other fields were applied to the simulated datasets. Simple imputation techniques and expectation maximization (EM) were implemented in SPSS software, while machine learning techniques such as k-nearest neighbor (kNN), together with hot-deck imputation, were implemented in the Python programming language. In the performance evaluation based on error metrics, the EM method was found to be the most suitable completion method. Homogeneity analyses were performed in Mathematica to identify possible changes and inconsistencies in the completed rainfall datasets. These analyses revealed that most of the completed rainfall datasets are homogeneous at the class 1 level, consistent and reliable, and show no systematic changes over time.
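The evaluation idea described in the abstract (hide known values, impute them, score the estimates with an error metric) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the synthetic three-station rainfall series, the completely-at-random choice of masked cells, the value k=3, the use of RMSE as the error metric, and all function names are assumptions made only for illustration.

```python
import math
import random


def column_means(rows):
    """Mean of the observed (non-None) values in each column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return means


def mean_impute(rows):
    """Simple imputation: replace each missing value with its column mean."""
    means = column_means(rows)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def knn_impute(rows, k=3):
    """Replace each missing value with the average of that column in the
    k nearest rows, using distance over the columns both rows have observed."""
    means = column_means(rows)
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                    candidates.append((dist, other[j]))
            candidates.sort(key=lambda t: t[0])
            nearest = [v for _, v in candidates[:k]]
            # Fall back to the column mean when no comparable row exists.
            filled[i][j] = sum(nearest) / len(nearest) if nearest else means[j]
    return filled


def rmse(truth, estimate, mask):
    """Root mean squared error over the masked (artificially removed) cells."""
    errors = [(truth[i][j] - estimate[i][j]) ** 2 for i, j in mask]
    return math.sqrt(sum(errors) / len(errors))


# Synthetic monthly rainfall (mm) at three hypothetical, correlated stations.
rng = random.Random(0)
truth = []
for _ in range(120):
    regional = rng.gammavariate(2.0, 30.0)
    truth.append([max(0.0, regional * f + rng.gauss(0.0, 5.0)) for f in (1.0, 0.8, 1.2)])

# Remove 10% of the cells completely at random (an assumed mechanism).
cells = [(i, j) for i in range(120) for j in range(3)]
mask = rng.sample(cells, 36)
observed = [list(r) for r in truth]
for i, j in mask:
    observed[i][j] = None

print("RMSE, mean imputation:", rmse(truth, mean_impute(observed), mask))
print("RMSE, kNN imputation: ", rmse(truth, knn_impute(observed, k=3), mask))
```

Because the masked values are known, the two printed RMSE values can be compared directly. A real study would also vary the share of missing values, the missing data pattern and the missing data mechanism, as the abstract describes.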
Source
Water Resources Management
Volume
38
Issue
15