Engineering Group | Research Article | Article ID: igmin197

Enhancing Material Property Predictions through Optimized KNN Imputation and Deep Neural Network Modeling

Materials Science, Machine Learning

Affiliation

    Department of Computer Engineering, Jeju National University, Jeju 63243, Republic of Korea

Abstract

In materials science, the integrity and completeness of datasets are critical for robust predictive modeling. Material datasets, however, frequently contain missing values caused by measurement errors, unavailable data, or experimental limitations, and these gaps can significantly undermine the accuracy of property predictions. To address this challenge, we introduce an optimized K-Nearest Neighbors (KNN) imputation method, combined with Deep Neural Network (DNN) modeling, to improve the accuracy of material property predictions. Our study compares the Enhanced KNN method against three baseline imputation techniques: mean imputation, Multiple Imputation by Chained Equations (MICE), and standard KNN imputation. The Enhanced KNN method achieves an R² score of 0.973, an improvement of 0.227 over mean imputation, 0.141 over MICE, and 0.044 over standard KNN imputation. The method also preserves the statistical characteristics of the data that reliable predictions in materials science depend on.
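The abstract does not say how the KNN imputation is optimized, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that "optimized" means choosing the neighbor count k by hiding cells whose values are known and measuring how well each k recovers them. The function names (`nan_distance`, `knn_impute`, `choose_k`) and the toy data are hypothetical, and the DNN stage is not shown.

```python
import math

NAN = math.nan

def nan_distance(a, b):
    """Euclidean distance over features observed in both rows, rescaled to the
    full feature count. Returns inf when the rows share no observed feature."""
    shared = [(x, y) for x, y in zip(a, b) if not (math.isnan(x) or math.isnan(y))]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

def knn_impute(rows, k):
    """Fill each missing cell with the mean of that feature over the k nearest
    rows in which the feature is observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if not math.isnan(value):
                continue
            candidates = sorted(
                (nan_distance(row, other), other[j])
                for n, other in enumerate(rows)
                if n != i and not math.isnan(other[j])
            )
            neighbours = [v for d, v in candidates[:k] if math.isfinite(d)]
            if neighbours:
                filled[i][j] = sum(neighbours) / len(neighbours)
    return filled

def choose_k(rows, candidates, holdout):
    """Assumed tuning step: hide known cells, impute them with each candidate k,
    and keep the k with the lowest mean absolute error."""
    best_k, best_err = None, math.inf
    for k in candidates:
        masked = [list(r) for r in rows]
        for i, j in holdout:
            masked[i][j] = NAN
        filled = knn_impute(masked, k)
        err = sum(abs(filled[i][j] - rows[i][j]) for i, j in holdout) / len(holdout)
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err

# Invented toy data: two groups of samples, one cell missing in the last row.
rows = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 10.5],
    [0.9, 1.9, 9.5],
    [5.0, 6.0, 50.0],
    [5.1, 6.1, 50.5],
    [4.9, 5.9, NAN],
]

filled = knn_impute(rows, k=2)
best_k, best_err = choose_k(rows, candidates=[1, 2, 5], holdout=[(0, 2), (3, 2)])
print(filled[5][2], best_k, best_err)
```

On data this small, a large k pulls in samples from the wrong group and the held-out error grows, which is the kind of sensitivity a tuned k is meant to avoid. A real pipeline would more likely use an established imputer and cross-validated scoring of the downstream model.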

