Exploring the Efficacy of Modified Transfer Learning in Identifying Parkinson's Disease Through Drawn Image Patterns

Published 6 Oct 2025 in cs.CV | (2510.05015v1)

Abstract: Parkinson's disease (PD) is a progressive neurodegenerative condition characterized by the death of dopaminergic neurons, leading to various movement disorder symptoms. Early diagnosis of PD is crucial to prevent adverse effects, yet traditional diagnostic methods are often cumbersome and costly. In this study, a machine learning-based approach is proposed using hand-drawn spiral and wave images as potential biomarkers for PD detection. Our methodology leverages convolutional neural networks (CNNs), transfer learning, and attention mechanisms to improve model performance and resilience against overfitting. To enhance the diversity and richness of both spiral and wave categories, the training dataset undergoes augmentation to increase the number of images. The proposed architecture comprises three phases: utilizing pre-trained CNNs, incorporating custom convolutional layers, and ensemble voting. Employing hard voting further enhances performance by aggregating predictions from multiple models. Experimental results show promising accuracy rates. For spiral images, weighted average precision, recall, and F1-score are 90%, and for wave images, they are 96.67%. After combining the predictions through ensemble hard voting, the overall accuracy is 93.3%. These findings underscore the potential of machine learning in early PD diagnosis, offering a non-invasive and cost-effective solution to improve patient outcomes.

Summary

  • The paper demonstrates that integrating modified transfer learning with custom convolutional and attention layers improves early Parkinson’s disease detection through drawn image patterns.
  • It employs an ensemble hard voting strategy to combine spiral and wave model predictions, enhancing overall classification accuracy and reducing model bias.
  • The approach shows competitive results relative to previous studies, offering a scalable, cost-effective, and non-invasive diagnostic tool.

Modified Transfer Learning for Parkinson's Disease Detection from Drawn Image Patterns

Introduction

This paper studies modified transfer learning for the early detection of Parkinson's disease (PD), using hand-drawn spiral and wave images as diagnostic biomarkers. The approach combines pre-trained convolutional neural networks (CNNs), custom convolutional layers, and attention mechanisms, then aggregates the resulting models through ensemble voting to make classification more robust. The work is motivated by the need for non-invasive, cost-effective, and scalable diagnostic tools, given the limitations of traditional clinical assessments and the prevalence of PD in the aging population.

Figure 1: Spiral and wave drawings from healthy individuals and PD patients, illustrating the visual differences leveraged for classification.

Dataset and Preprocessing

The dataset, sourced from Kaggle and originally introduced by Zham et al., consists of balanced sets of spiral and wave drawings from both healthy controls and PD patients. Each category contains 36 training and 15 testing images, ensuring class balance. Preprocessing involves Otsu's thresholding for binarization and resizing to $224 \times 224$ pixels, standardizing the input for CNN architectures.

Figure 2: Representative preprocessed spiral and wave images after binarization and resizing.
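The two preprocessing steps can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the image is assumed to be a grayscale grid of 0..255 integers, and nearest-neighbour resizing is an assumption, since the summary does not say which interpolation was used.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, pixel values in 0..255


def otsu_threshold(image: Image) -> int:
    """Return the threshold t that maximizes between-class variance
    when pixels are split into {<= t} and {> t}."""
    hist = [0] * 256
    for row in image:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is background from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image: Image, threshold: int) -> Image:
    """Map pixels above the threshold to 255 and the rest to 0."""
    return [[255 if px > threshold else 0 for px in row] for row in image]


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Nearest-neighbour resize (an assumed interpolation choice)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def preprocess(image: Image, size: int = 224) -> Image:
    """Binarize with Otsu's threshold, then resize to size x size."""
    return resize_nearest(binarize(image, otsu_threshold(image)), size, size)
```

In practice an image library would normally handle both steps; the pure-Python version is only meant to show what Otsu's method computes.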

To address the limited dataset size and improve generalization, extensive data augmentation is performed. Augmentation parameters include rotation, zoom, width/height shifts, and shear, tailored separately for spiral and wave images. This process increases both the number and the variety of training images.

Figure 3: Distribution of images across categories after augmentation, demonstrating the expanded dataset size.
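As a rough sketch of how such augmentation parameters combine, the snippet below samples one random affine transform from per-drawing-type ranges. All numeric ranges, the `AugmentRanges` class, and the `sample_affine` function are hypothetical placeholders, since this summary does not list the values the authors used.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass
class AugmentRanges:
    rotation_deg: float   # rotation sampled from [-rotation_deg, +rotation_deg]
    zoom: float           # zoom factor sampled from [1 - zoom, 1 + zoom]
    width_shift: float    # horizontal shift as a fraction of image width
    height_shift: float   # vertical shift as a fraction of image height
    shear_deg: float      # shear angle sampled from [-shear_deg, +shear_deg]


# Placeholder values only: the paper tunes spiral and wave settings
# separately, but the actual numbers are not given in this summary.
SPIRAL_RANGES = AugmentRanges(30.0, 0.10, 0.10, 0.10, 5.0)
WAVE_RANGES = AugmentRanges(10.0, 0.20, 0.05, 0.05, 10.0)


def sample_affine(ranges: AugmentRanges, size: int,
                  rng: random.Random) -> List[List[float]]:
    """Sample a 2x3 affine matrix [A | t] with A = zoom * R(theta) @ Shear(k)."""
    theta = math.radians(rng.uniform(-ranges.rotation_deg, ranges.rotation_deg))
    shear = math.radians(rng.uniform(-ranges.shear_deg, ranges.shear_deg))
    zoom = rng.uniform(1.0 - ranges.zoom, 1.0 + ranges.zoom)
    tx = rng.uniform(-ranges.width_shift, ranges.width_shift) * size
    ty = rng.uniform(-ranges.height_shift, ranges.height_shift) * size

    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k = math.tan(shear)
    # R(theta) = [[cos, -sin], [sin, cos]], Shear(k) = [[1, k], [0, 1]]
    return [
        [zoom * cos_t, zoom * (cos_t * k - sin_t), tx],
        [zoom * sin_t, zoom * (sin_t * k + cos_t), ty],
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_affine(SPIRAL_RANGES, 224, rng))
```

Each sampled matrix would be applied to a training image to produce one augmented copy. Keeping separate range objects for spirals and waves mirrors the per-type tuning described above.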

Model Architecture

The proposed architecture is a three-phase pipeline:

  1. Transfer Learning Backbone: Pre-trained VGG16 and VGG19 models (initialized with ImageNet weights) serve as feature extractors. VGG16 is used for spiral images, while VGG19 is selected for wave images, reflecting the distinct structural characteristics of each drawing type.
  2. Custom Convolutional and Attention Layers: Additional convolutional layers are appended to the backbone to capture domain-specific features. A spatial attention mechanism is integrated to dynamically reweight feature maps, focusing the model on diagnostically relevant regions.
  3. Ensemble Hard Voting: Predictions from the spiral and wave models are aggregated using hard voting, improving overall classification reliability and mitigating individual model biases.

    Figure 4: Schematic of the proposed model architecture, illustrating the integration of transfer learning, custom layers, attention, and ensemble voting.
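To make the idea of spatially reweighting feature maps concrete, here is a simplified sketch. It is not the paper's attention layer, whose exact form this summary does not specify. The per-location score is assumed to be a weighted sum of the channel average and channel maximum, in the spirit of CBAM-style spatial attention but reduced to a 1x1 kernel, and the parameters `w_avg`, `w_max`, and `bias` are hypothetical stand-ins for weights that would be learned during training.

```python
import math
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # indexed as [height][width][channel]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(
    features: FeatureMap,
    w_avg: float = 1.0,   # hypothetical weight on the channel average
    w_max: float = 1.0,   # hypothetical weight on the channel maximum
    bias: float = 0.0,    # hypothetical bias term
) -> Tuple[FeatureMap, List[List[float]]]:
    """Scale every spatial location by a sigmoid mask in (0, 1).

    Returns the reweighted feature map and the mask itself.
    """
    reweighted: FeatureMap = []
    mask: List[List[float]] = []
    for row in features:
        out_row, mask_row = [], []
        for channels in row:
            avg = sum(channels) / len(channels)
            weight = sigmoid(w_avg * avg + w_max * max(channels) + bias)
            mask_row.append(weight)
            out_row.append([weight * value for value in channels])
        reweighted.append(out_row)
        mask.append(mask_row)
    return reweighted, mask


if __name__ == "__main__":
    # A 1x2 feature map with 2 channels: one inactive and one active location.
    fmap = [[[0.0, 0.0], [2.0, 4.0]]]
    out, attn = spatial_attention(fmap)
    print(attn)  # the active location receives the larger weight
    print(out)
```

Locations with stronger activations receive weights closer to 1, so later layers see them nearly unchanged, while weakly activated regions are scaled down.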

Training and Optimization

The models are trained using the Adam optimizer with categorical cross-entropy loss, at learning rates of $0.0005$ (spiral) and $0.0001$ (wave). Training runs for 150 epochs with a batch size of 32. Dropout and attention mechanisms are used to counteract overfitting, and checkpointing on validation loss retains the best-performing weights for each model.

Figure 5: Accuracy and loss curves for spiral drawings, indicating stable convergence and effective regularization.

Figure 6: Accuracy and loss curves for wave drawings, showing greater variability but effective model selection via checkpointing.
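The loss function and the checkpointing rule described in this section can be sketched without any deep learning framework. The hyperparameter values in `TRAINING_CONFIG` come from the text above. The `BestCheckpoint` class and the validation-loss trace in the usage example are hypothetical illustrations, not the authors' implementation or results.

```python
import math
from typing import Any, List, Optional

# Hyperparameters as reported in the summary.
TRAINING_CONFIG = {
    "spiral": {"optimizer": "adam", "learning_rate": 0.0005},
    "wave": {"optimizer": "adam", "learning_rate": 0.0001},
    "epochs": 150,
    "batch_size": 32,
}


def categorical_cross_entropy(
    y_true: List[List[float]], y_prob: List[List[float]], eps: float = 1e-12
) -> float:
    """Mean categorical cross-entropy over a batch.

    y_true holds one-hot rows, y_prob holds predicted class probabilities.
    """
    total = 0.0
    for true_row, prob_row in zip(y_true, y_prob):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(true_row, prob_row))
    return total / len(y_true)


class BestCheckpoint:
    """Keep the weights from the epoch with the lowest validation loss."""

    def __init__(self) -> None:
        self.best_loss = math.inf
        self.best_epoch: Optional[int] = None
        self.best_weights: Any = None

    def update(self, epoch: int, val_loss: float, weights: Any) -> bool:
        """Store the weights if val_loss improved; return True when saved."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.best_weights = weights
            return True
        return False


if __name__ == "__main__":
    checkpoint = BestCheckpoint()
    # Made-up validation losses for three epochs, purely for illustration.
    for epoch, val_loss in enumerate([0.9, 0.4, 0.6], start=1):
        checkpoint.update(epoch, val_loss, weights={"epoch": epoch})
    print(checkpoint.best_epoch, checkpoint.best_loss)
```

Selecting on validation loss rather than on the final epoch matters most when the curves fluctuate, as the wave-drawing curves are described as doing.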

Experimental Results

Individual Model Performance

  • Spiral Model: Achieves 90% accuracy, with class-wise precision of 88% (healthy) and 93% (PD), and F1-scores of 90% for both classes.
  • Wave Model: Achieves 96.67% accuracy, with class-wise precision of 94% (healthy) and 100% (PD), and F1-scores of 97% for both classes.
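For readers who want to see how class-wise precision, recall, F1, and the weighted averages follow from raw counts, the sketch below computes them from a confusion matrix. The matrix in the usage example is a hypothetical reconstruction, not data taken from the paper. It assumes 15 test images per class and was chosen only because it is arithmetically consistent with the spiral figures reported above.

```python
from typing import Dict, List, Tuple

Report = Dict[str, Dict[str, float]]


def classification_report(
    confusion: List[List[int]], labels: List[str]
) -> Tuple[Report, float]:
    """Compute per-class and support-weighted metrics.

    confusion[i][j] is the number of samples of true class i predicted as j.
    Returns (report, accuracy).
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    report: Report = {}
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))
        actual_k = sum(confusion[k])
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[name] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": float(actual_k),
        }
    # Weighted average: each class contributes in proportion to its support.
    report["weighted avg"] = {
        metric: sum(report[name][metric] * report[name]["support"] for name in labels) / total
        for metric in ("precision", "recall", "f1")
    }
    accuracy = sum(confusion[k][k] for k in range(n)) / total
    return report, accuracy


if __name__ == "__main__":
    # Hypothetical counts (rows: true healthy, true PD), assumed for illustration.
    spiral_confusion = [[14, 1], [2, 13]]
    report, accuracy = classification_report(spiral_confusion, ["healthy", "PD"])
    print(f"accuracy: {accuracy:.2%}")
    for name, metrics in report.items():
        print(name, {key: round(value, 3) for key, value in metrics.items()})
```

With these assumed counts, healthy precision is 14/16 = 0.875 and PD precision is 13/14, roughly 0.93, which round to the 88% and 93% quoted for the spiral model.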

Ensemble Performance

The ensemble model, combining spiral and wave predictions via hard voting, attains an overall accuracy of 93.3%. Precision is 91% (healthy) and 96% (PD), with recall values of 97% and 90%, respectively, and an F1-score of 94%. Notably, after post-hoc analysis and correction of misclassifications, a final accuracy of 98% is reported, with only one false positive and no false negatives.

Figure 7: Confusion matrix for the ensemble model, highlighting the distribution of true positives, false positives, and false negatives.
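Hard voting itself is simple to express: each model casts one label and the most common label wins. The sketch below is a generic majority vote, not the paper's code. With only two voters a tie is possible, and this summary does not say how ties are resolved, so the `tie_break` argument and the choice to fall back to "PD" in the example are assumptions.

```python
from collections import Counter
from typing import List


def hard_vote(votes: List[str], tie_break: str) -> str:
    """Return the majority label, or tie_break when the top counts are equal."""
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_break
    return ranked[0][0]


def ensemble_predict(
    spiral_preds: List[str], wave_preds: List[str], tie_break: str = "PD"
) -> List[str]:
    """Combine per-subject predictions from the two models by hard voting."""
    return [
        hard_vote([spiral, wave], tie_break)
        for spiral, wave in zip(spiral_preds, wave_preds)
    ]


if __name__ == "__main__":
    spiral = ["healthy", "PD", "healthy"]
    wave = ["healthy", "PD", "PD"]
    print(ensemble_predict(spiral, wave))
```

Breaking ties toward "PD" favours sensitivity over specificity, which is one defensible choice for a screening tool. Other rules, such as trusting the model with the higher validation accuracy, would be equally easy to plug in.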

Comparative Analysis

The proposed approach is benchmarked against prior works:

  • Chakraborty et al. (2020): 93.3% accuracy, but with overfitting and unstable training.
  • Islam et al. (2021): 96.64% accuracy, but with test set augmentation, potentially inflating results.
  • Proposed model: 93.3% accuracy on unaugmented test data, with improved training stability and reduced overfitting.

The use of attention mechanisms and ensemble voting distinguishes this approach, yielding robust performance without compromising evaluation integrity.

Theoretical and Practical Implications

The integration of transfer learning, custom convolutional layers, and attention mechanisms demonstrates the efficacy of leveraging hierarchical and context-aware feature extraction for medical image classification, even with limited data. The ensemble strategy further enhances reliability, a critical requirement in clinical applications. The methodology is computationally efficient, scalable, and adaptable to other diagnostic tasks involving structured image patterns.

Theoretically, the results support the hypothesis that spatial attention can compensate for limited data by focusing model capacity on salient regions, and that ensemble methods can mitigate individual model weaknesses. The strong numerical results, particularly the high F1-scores and low false negative rate, support the clinical viability of the approach.

Future Directions

Future work should focus on expanding the dataset to improve generalizability, exploring alternative backbone architectures (e.g., EfficientNet, Vision Transformers), and integrating multimodal data (e.g., pen pressure, temporal dynamics) for richer feature representation. Additionally, explainability methods could be incorporated to provide clinicians with interpretable model outputs, facilitating adoption in real-world diagnostic workflows.

Conclusion

This study presents a robust, non-invasive framework for early PD detection using hand-drawn image patterns, leveraging modified transfer learning, attention mechanisms, and ensemble voting. The approach achieves high accuracy and reliability on a balanced, unaugmented test set, demonstrating both practical utility and methodological rigor. The findings have significant implications for the development of accessible, AI-driven diagnostic tools in neurology, with potential for broader application across medical imaging domains.


Authors (2)
