Comparison of K-Nearest Neighbor and Naive Bayes Algorithms for Tuberculosis Diagnosis Classification
Dedi Setiadi, Alfis Arif, Anik Oktaria
Journal of Artificial Intelligence and Software Engineering (J-AISE) · 2025-03
Abstract
Tuberculosis is an infectious disease caused by the bacterium Mycobacterium tuberculosis. It is a serious global health problem and can be fatal if not treated properly. At the Sidorejo Health Center, patients are currently diagnosed against several benchmarks drawn from the medical history they report (complaints, symptoms, and risk factors), but the accuracy of this diagnostic process is not yet known. Comparing the K-nearest neighbor and naïve Bayes algorithms for classifying tuberculosis can give the Sidorejo Health Center insight into the accuracy of tuberculosis diagnosis based on medical information such as symptoms and medical history; the patient data are processed using the RapidMiner application. The study follows the CRISP-DM methodology, which consists of business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The models are evaluated with a confusion matrix to measure accuracy: the K-nearest neighbor algorithm achieves the higher accuracy, 98%, while the naïve Bayes algorithm achieves the lower accuracy, 70%.
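As a rough illustration of how accuracy is read off a confusion matrix, the following sketch counts true and false positives and negatives for a binary diagnosis label and computes accuracy as (TP + TN) / total. The function names, the "TB" and "non-TB" labels, and the ten sample records are hypothetical and made up for illustration; they are not the study's data or its RapidMiner workflow.

```python
from collections import Counter


def confusion_matrix(actual, predicted, positive="TB"):
    """Count TP, FP, FN, and TN for a binary label (hypothetical helper)."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["tp"] += 1  # sick patient correctly flagged
        elif a != positive and p == positive:
            counts["fp"] += 1  # healthy patient wrongly flagged
        elif a == positive and p != positive:
            counts["fn"] += 1  # sick patient missed
        else:
            counts["tn"] += 1  # healthy patient correctly cleared
    return counts


def accuracy(cm):
    """Accuracy = (TP + TN) / all predictions."""
    total = cm["tp"] + cm["fp"] + cm["fn"] + cm["tn"]
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


# Made-up labels for illustration only; not taken from the study.
actual = ["TB", "TB", "TB", "non-TB", "non-TB",
          "TB", "non-TB", "non-TB", "TB", "non-TB"]
predicted = ["TB", "TB", "non-TB", "non-TB", "non-TB",
             "TB", "TB", "non-TB", "TB", "non-TB"]

cm = confusion_matrix(actual, predicted)
print(dict(cm), accuracy(cm))
```

Running the same two functions on each classifier's predictions over the same test set would give directly comparable accuracy figures. In a medical setting the false-negative count is also worth inspecting on its own, since accuracy alone does not show how many sick patients were missed.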
MeSH terms
- Naive Bayes classifier
- k-nearest neighbors algorithm
- Bayes' theorem
- Artificial intelligence
- Pattern recognition (psychology)
- Tuberculosis
- Algorithm
- Computer science