Help ?

IGMIN: We're glad you're here. Please click 'create a new query' if you are a new visitor to our website and need further information from us.

If you are already a member of our network and need to keep track of any developments regarding a question you have already submitted, click 'take me to my Query.'

Browse by Subjects

Welcome to IgMin Research – an Open Access journal uniting Biology, Medicine, and Engineering. We’re dedicated to advancing global knowledge and fostering collaboration across scientific fields.

Members

Our focus is on fostering synergy among fields and promoting the swift expansion of scientific understanding.

Articles

Our focus is on fostering synergy among fields and promoting the swift expansion of scientific understanding.

Explore Content

Our focus is on fostering synergy among fields and promoting the swift expansion of scientific understanding.

Identify Us

Our focus is on fostering synergy among fields and promoting the swift expansion of scientific understanding.

IgMin Corporation

Welcome to IgMin, a leading platform dedicated to enhancing knowledge dissemination and professional growth across multiple fields of science, technology, and the humanities. We believe in the power of open access, collaboration, and innovation. Our goal is to provide individuals and organizations with the tools they need to succeed in the global knowledge economy.

Publications Support
[email protected]
E-Books Support
[email protected]
Webinars & Conferences Support
[email protected]
Content Writing Support
[email protected]
IT Support
[email protected]

Search

Explore Section

Content for the explore section slider goes here.

Abstract

Abstract at IgMin Research

Our focus is on fostering synergy among fields and promoting the swift expansion of scientific understanding.

Engineering Group Research Article Article ID: igmin197

Enhancing Material Property Predictions through Optimized KNN Imputation and Deep Neural Network Modeling

Materials Science Data EngineeringMachine Learning DOI10.61927/igmin197 Affiliation

Affiliation

    Murad Ali Khan, Department of Computer Engineering, Jeju National University, Jeju 63243, Republic of Korea, Email: [email protected]

5.4k
VIEWS
1.2k
DOWNLOADS
Connect with Us

Abstract

In materials science, the integrity and completeness of datasets are critical for robust predictive modeling. Unfortunately, material datasets frequently contain missing values due to factors such as measurement errors, data non-availability, or experimental limitations, which can significantly undermine the accuracy of property predictions. To tackle this challenge, we introduce an optimized K-Nearest Neighbors (KNN) imputation method, augmented with Deep Neural Network (DNN) modeling, to enhance the accuracy of predicting material properties. Our study compares the performance of our Enhanced KNN method against traditional imputation techniques—mean imputation and Multiple Imputation by Chained Equations (MICE). The results indicate that our Enhanced KNN method achieves a superior R² score of 0.973, which represents a significant improvement of 0.227 over Mean imputation, 0.141 over MICE, and 0.044 over KNN imputation. This enhancement not only boosts the data integrity but also preserves the statistical characteristics essential for reliable predictions in materials science.

Figures

References

    1. Emmanuel T. A survey on missing data in machine learning. Journal of Big Data. 2021; 8: 1-37.
    2. Lee KJ, Tilling KM, Cornish RP, Little RJA, Bell ML, Goetghebeur E, Hogan JW, Carpenter JR; STRATOS initiative. Framework for the treatment and reporting of missing data in observational studies: The Treatment And Reporting of Missing data in Observational Studies framework. J Clin Epidemiol. 2021 Jun;134:79-88. doi: 10.1016/j.jclinepi.2021.01.008. Epub 2021 Feb 2. PMID: 33539930; PMCID: PMC8168830.
    3. Saeipourdizaj P, Sarbakhsh P, Gholampour A. Application of imputation methods for missing values of PM10 and O3 data: Interpolation, moving average and K-nearest neighbor methods. Environ Health Eng Manage J. 2021;8(3):215-226.
    4. Abidin NZ, Ismail AR. An improved K-nearest neighbour with grasshopper optimization algorithm for imputation of missing data. Int J Adv Intell Informatics. 2021; 7(3).
    5. Xie Q. Online prediction of mechanical properties of hot rolled steel plate using machine learning. Mater Des. 2021; 197:109201.
    6. Gupta R, Srivastava D, Sahu M, Tiwari S, Ambasta RK, Kumar P. Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers. 2021 Aug;25(3):1315-1360. doi: 10.1007/s11030-021-10217-3. Epub 2021 Apr 12. PMID: 33844136; PMCID: PMC8040371.
    7. Peng D. RESI: a region-splitting imputation method for different types of missing data. Expert Syst Appl. 2021; 168:114425.
    8. Adhikari D. A comprehensive survey on imputation of missing data in internet of things. ACM Comput Surveys. 2022; 55(7):1-38.
    9. Alnowaiser K. Improving Healthcare Prediction of Diabetic Patients Using KNN Imputed Features and Tri-Ensemble Model. IEEE Access. 2024.
    10. Bertsimas D, Pawlowski C, Zhuo YD. From predictive methods to missing data imputation: an optimization approach. J Mach Learn Res. 2018; 18(196):1-39.
    11. Khan MA. An optimized ensemble prediction model using AutoML based on soft voting classifier for network intrusion detection. J Netw Comput Appl. 2023; 212:103560.
    12. Jäger S, Allhorn A, Bießmann F. A benchmark for data imputation methods. Front Big Data. 2021; 4:693674.
    13. Gad AM, Abdelkhalek RHM. Imputation methods for longitudinal data: A comparative study. Int J Stat Distr Appl. 2017; 3(4):72.
    14. Van Buuren S. Flexible imputation of missing data. CRC Press; 2018.
    15. Chen S, Haziza D. Recent developments in dealing with item non-response in surveys: A critical review. Int Stat Rev. 2019; 87(S192-S218).
    16. Van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate imputation by chained equations in R. J Stat Softw. 2011; 45:1-67.
    17. Troyanskaya O. Missing value estimation methods for DNA microarrays. Bioinformatics. 2001; 17(6):520-525.
    18. Batista GEAPA, Monard MC. An analysis of four missing data treatment methods for supervised learning. Appl Artif Intell. 2003; 17(5-6):519-533.
    19. Keerin P, Boongoen T. Improved knn imputation for missing values in gene expression data. Comput Mater Continua. 2021; 70(2):4009-4025.
    20. Chang Z. Neural Embeddings for kNN Search in Biological Sequence. Proc AAAI Conf Artif Intell. 2024; 38(1).
    21. Di Gesu V, Lo Bosco G, Pinello L. A one class KNN for signal identification: a biological case study. Int J Knowl Eng Soft Data Paradigms. 2009; 1(4):376-389.
    22. Khan MA. Enhanced abnormal data detection hybrid strategy based on heuristic and stochastic approaches for efficient patients rehabilitation. Future Gener Comput Syst. 2024; 154:101-122.
    23. Triguero I. Transforming big data into smart data: An insight on the use of the k-nearest neighbors algorithm to obtain quality data. Wiley Interdiscip Rev Data Min Knowl Discov. 2019; 9(2)
    24. Li D, Gu H, Zhang L. A hybrid genetic algorithm–fuzzy c-means approach for incomplete data clustering based on nearest-neighbor intervals. Soft Comput. 2013; 17:1787-1796.
    25. Petrazzini BO. Evaluation of different approaches for missing data imputation on features associated to genomic data. BioData Min. 2021; 14:1-13.
    26. Nadimi-Shahraki MH. A hybrid imputation method for multi-pattern missing data: A case study on type II diabetes diagnosis. Electronics. 2021; 10(24):3167.
    27. Xiang G. Research on Predicting the Bending Strength of Ceramic Matrix Composites with Process of Incomplete Data. Int J Mach Learn Comput. 2021; 11(3).
    28. Han W. Prediction of flowability and strength in controlled low-strength material through regression and oversampling algorithm with deep neural network. Case Stud Constr Mater. 2024; 20.
    29. Lyngdoh GA. Prediction of concrete strengths enabled by missing data imputation and interpretable machine learning. Cem Concr Compos. 2022; 128:104414.
    30. Karamti H, Alharthi R, Anizi AA, Alhebshi RM, Eshmawi AA, Alsubai S, Umer M. Improving Prediction of Cervical Cancer Using KNN Imputed SMOTE Features and Multi-Model Ensemble Learning Approach. Cancers (Basel). 2023 Sep 4;15(17):4412. doi: 10.3390/cancers15174412. PMID: 37686692; PMCID: PMC10486648.
    31. Johnston J, Kistemaker G, Sullivan PG. Comparison of different imputation methods. Interbull Bull. 2011; 44.
    32. Khan SI, Hoque ASML. SICE: an improved missing data imputation technique. J Big Data. 2020;7(1):37. doi: 10.1186/s40537-020-00313-w. Epub 2020 Jun 12. PMID: 32547903; PMCID: PMC7291187.
    33. Sanjar K. Missing data imputation for geolocation-based price prediction using KNN–MCF method. ISPRS Int J Geo-Inf. 2020; 9(4):227.
    34. Zhou X, Chai H, Zhao H, Luo CH, Yang Y. Imputing missing RNA-sequencing data from DNA methylation by using a transfer learning-based neural network. Gigascience. 2020 Jul 1;9(7):giaa076. doi: 10.1093/gigascience/giaa076. PMID: 32649756; PMCID: PMC7350980.
    35. Smith JL, Wilson ML, Nilson SM, Rowan TN, Schnabel RD, Decker JE, Seabury CM. Genome-wide association and genotype by environment interactions for growth traits in U.S. Red Angus cattle. BMC Genomics. 2022 Jul 16;23(1):517. doi: 10.1186/s12864-022-08667-6. PMID: 35842584; PMCID: PMC9287884.
    36. Lee T, Shi D. A comparison of full information maximum likelihood and multiple imputation in structural equation modeling with missing data. Psychol Methods. 2021 Aug;26(4):466-485. doi: 10.1037/met0000381. Epub 2021 Jan 28. PMID: 33507765.
    37. Kumar N. A new approach of outlier-robust missing value imputation for metabolomics data analysis. Curr Bioinformatics. 2019; 14(1):43-52.

Similar Articles

Potentially Toxic Metals in Cucumber Cucumis sativus Collected from Peninsular Malaysia: A Human Health Risk Assessment
Chee Kong Yap, Rosimah Nulit, Aziran Yaacob, Zaieka Shamsudin, Meng Chuan Ong, Wan Mohd Syazwan, Hideo Okamura, Yoshifumi Horie, Chee Seng Leow, Ahmad Dwi Setyawan, Krishnan Kumar, Wan Hee Cheng and Kennedy Aaron Aguol
DOI10.61927/igmin200
Kinetic Study of the Removal of Reafix Yellow B8G Dye by Boiler Ash
Peterson Filisbino Prinz, Mariane Hawerroth, Liliane Schier de Lima and Juliana Martins Teixeira de Abreu Pietrobelli
DOI10.61927/igmin127