About the Research

Understanding the science behind the system

A Hybrid GBDT Model for Advanced Diabetes Risk Prediction

Tushar Gupta, Manjari Gupta, Kunal Kaushik, Sanjana Jain — Raj Kumar Goel Institute of Technology

Abstract

Diabetes mellitus is one of the most prevalent chronic diseases worldwide, and early detection is critical for effective management and for preventing severe complications. This research proposes a novel Hybrid RF-GBDT (Random Forest – Gradient Boosted Decision Trees) ensemble model that combines the variance reduction of Random Forest bagging with the bias reduction of gradient boosting to improve prediction performance.
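One common way to combine a bagging model with a boosting model is soft voting, where the two models' predicted probabilities are averaged before thresholding. The abstract does not state how the Hybrid RF-GBDT merges its components, so the sketch below is an assumption rather than the authors' method. The function name `soft_vote`, the equal weights, and the example probabilities are all hypothetical.

```python
def soft_vote(p_rf, p_gbdt, w_rf=0.5, w_gbdt=0.5, threshold=0.5):
    """Blend two models' positive-class probabilities and threshold the result.

    Assumption: the hybrid is a weighted average of probabilities. The source
    does not confirm this combination rule.
    """
    if len(p_rf) != len(p_gbdt):
        raise ValueError("both models must score the same patients")
    total = w_rf + w_gbdt
    # Weighted average of the two probability estimates, one per patient.
    blended = [(w_rf * a + w_gbdt * b) / total for a, b in zip(p_rf, p_gbdt)]
    # Predict the positive class when the blended probability reaches the threshold.
    labels = [1 if p >= threshold else 0 for p in blended]
    return blended, labels


# Hypothetical diabetes probabilities for three patients from each model.
rf_probs = [0.80, 0.40, 0.10]
gbdt_probs = [0.60, 0.70, 0.20]
blended, labels = soft_vote(rf_probs, gbdt_probs)
```

In scikit-learn, `VotingClassifier` with `voting="soft"` applies the same kind of probability averaging to fitted estimators, so it would be one way to build such a blend from a `RandomForestClassifier` and a gradient-boosted model.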

The study evaluates seven machine-learning classifiers — Logistic Regression, Decision Tree, Random Forest, XGBoost, LightGBM, CatBoost, and the proposed Hybrid RF-GBDT — on the PIMA Indians Diabetes Dataset containing 768 patient records with 8 clinical features.

Experimental results show that the Hybrid RF-GBDT model achieves accuracy competitive with the other six classifiers and the highest F1 score among them, supporting ensemble hybridization as an approach for clinical decision support systems.
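Accuracy and F1 can rank models differently, which is why a model can be merely competitive on one and best on the other. F1 is the harmonic mean of precision and recall, so it rewards finding positive cases rather than overall agreement. The sketch below computes both metrics from confusion-matrix counts. The counts are made-up illustration values, not results from the study, and the function name is hypothetical.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) from true/false positive and negative counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical counts for two models scored on the same 154 test patients.
model_a = accuracy_and_f1(tp=30, fp=10, fn=24, tn=90)  # misses more positives
model_b = accuracy_and_f1(tp=40, fp=22, fn=14, tn=78)  # finds more positives
```

With these illustrative counts, the second model has slightly lower accuracy but a higher F1, because it recovers more of the positive cases.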

Methodology
Data Collection
Preprocessing
Feature Extraction
Model Training
Evaluation
Prediction
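
The Preprocessing stage is not described in detail on this page. A step often applied to the PIMA data is to treat zeros in measurements such as glucose or BMI as missing values and replace them with the column median, since zero is not a physiologically plausible reading for those fields. The sketch below assumes that approach. The function name and sample values are hypothetical.

```python
from statistics import median


def impute_zeros_with_median(values):
    """Replace zeros (assumed to mean "missing") with the median of the rest."""
    observed = [v for v in values if v != 0]
    if not observed:
        # Nothing to estimate the median from, so return the column unchanged.
        return list(values)
    fill = median(observed)
    return [fill if v == 0 else v for v in values]


# Hypothetical glucose column with two missing readings recorded as 0.
glucose = [148, 0, 183, 89, 0, 116]
cleaned = impute_zeros_with_median(glucose)
```

To avoid leaking information from the test set, the median would normally be computed on the training split only and then reused for the test split.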

Project Guide

Gyanender Kumar
Project Guide

Assistant Professor

Department of CSE (Data Science)
Raj Kumar Goel Institute of Technology

Research Team

Tushar Gupta
Researcher

Department of CSE (Data Science)

Manjari Gupta
Researcher

Department of CSE (Data Science)

Kunal Kaushik
Researcher

Department of CSE (Data Science)

Sanjana Jain
Researcher

Department of CSE (Data Science)

Technology Stack
Python, Flask, scikit-learn, XGBoost, LightGBM, CatBoost, NumPy, Pandas, Matplotlib, Seaborn, Plotly.js, Bootstrap 5, Jinja2, HTML5, CSS3, JavaScript
