Area: Deep learning, Computer Vision
Our AI Centre is equipped with advanced NVIDIA DGX2 systems, with A100 GPU cards enabling high-performance computing, artificial intelligence research, and large-scale model training within a secure, reliable, and scalable infrastructure.
AREA: Deep learning, Computer Vision
Key Highlights:
The rising prevalence of lumbar spine disorders demands scalable solutions for mass screening and automated diagnosis. Accurate analysis of specific MRI slices, such as mid-sagittal or transverse mid-height intervertebral disc (IVD) slices, is essential but currently relies on time-consuming, error-prone manual selection. Automating this process is crucial to enhance the efficiency and accuracy of computer-aided diagnostic systems. To address this need, this study introduces SpineDeep-Net, a novel deep learning-based framework that integrates self-attention mechanisms within a multi-layer convolutional neural network to automatically select the optimal transverse planes of lumbar spine MRI disc slices. By focusing on the mid-height slices of the L3/L4, L4/L5, and L5/S1 IVDs, the most diagnostically relevant slices, SpineDeep-Net eliminates the reliance on manual selection, thereby accelerating and improving the diagnostic pipeline. Unlike standard attention, the proposed dual self-attention employs two sequential attention stages that jointly enhance long-range spatial cue extraction and emphasize subtle disc-level differences. This mechanism enables the model to focus more effectively on diagnostically relevant regions within lumbar MRI slices by dynamically recalibrating feature maps and strengthening feature dependencies. Experimental evaluations show that SpineDeep-Net achieves 96.83% accuracy and 98.41% specificity, outperforming state-of-the-art methods. By automating the selection and classification of clinically critical disc slices, SpineDeep-Net addresses a key challenge in lumbar spine diagnostics, providing a reliable, scalable, and efficient tool that aids radiologists in making informed clinical decisions. The proposed framework highlights the potential of self-attention-guided deep learning in advancing healthcare diagnostics.
Relevant Publication:
Rashmi Singh, Rakesh Chandra Joshi, Suzain Rashid, Radim Burget and Malay Kishore Dutta
"SpineDeep-Net: Dual-Self-Attention-Based Deep Neural Network for Automating Slice Selection and Precise Transverse Plane Localization in Lumbar Spine MRI for Intervertebral Disc Analysis", International Journal of Imaging Systems and Technology, 2025. Wiley Publishers, SCI indexed, Impact Factor 2.5.
AREA: Deep Learning and Medical Image Analysis
Key Highlights:
Relevant Publication:
Anjali Singh, Parth Mani Sharma, Abhishek Kaushal, Malay Kishore Dutta, “TinyEyeNet: An Efficient CNN for Classifying Anterior Segment Eye Conditions”, 5th International Conference on Advanced Network Technologies and Intelligent Computing. Publisher: CCIS, Springer Nature Publishers.
AREA: Deep Learning, Brain Computer Interface
Key Highlights:
Developed a novel BMFCNet architecture integrating blended multi-level feature extraction for robust detection of Major Depressive Disorder (MDD) from EEG signals. Introduced a Residual-Inception module to effectively capture both low-level (LL) and high-level (HL) discriminative EEG features, enhancing representational capacity. Developed a Constraint Fusion mechanism for adaptive weighting and fusion of LL and HL features, improving feature integration and classification performance. Addressed subjectivity in MDD diagnosis by providing an automated, EEG-based framework that enhances accuracy, reliability, and clinical applicability. Validated the proposed model on benchmark datasets, demonstrating superior performance compared to 16 state-of-the-art methods in terms of accuracy and efficiency.
Relevant Publication:
Mohan Karnati, Geet Sahu, Gautam Verma, Ayan Seal, Malay Kishore Dutta, Joanna Jaworek-Korjakowska. "BMFCNet: Blended Multi-Level Features with Constraint Fusion Network for Depression Detection from EEG Signals", in IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1-14, 2025, Art no. 2511414, doi: 10.1109/TIM.2025.3545204, SCI indexed, Impact Factor 5.6.
At Amity, innovation thrives at the intersection of AI and real-world challenges. Our students are pioneering advancements in various AI areas like deep learning, computer vision, neural networks, and more to drive impactful change. From detecting neurological disorders to enhancing medical imaging, these AI-driven projects showcase their dedication to technological excellence and societal progress.
Computer Vision, Attention Models
Developed an AI-assisted otoscope framework that uses deep learning to automatically diagnose common ear diseases from otoscopic images with high accuracy. The system enables early detection and reduces dependence on specialist expertise, supporting real-time, portable screening in primary healthcare settings.
Artificial Intelligence
Satellite imagery is essential for applications such as environmental monitoring, disaster management, urban planning, and precision agriculture. Image quality is often limited by sensor constraints, low revisit frequency, and high acquisition costs, resulting in reduced spatial resolution. A novel super-resolution framework, RCAN-RS, is proposed, extending the Residual Channel Attention Network (RCAN). The model incorporates domain-specific enhancements for remote sensing imagery:
• Dual-pooling-based channel attention for improved feature representation
• Spectral attention module to preserve band-specific information
• Edge enhancement unit to maintain sharp boundaries and fine structural details
The integration of attention mechanisms and edge-aware processing significantly enhances reconstruction quality. The proposed approach is highly suitable for improving satellite imagery in remote sensing applications.
Deep Learning, Computer Vision, Attention Models
Developed SpineDeep-Net, a novel deep learning framework integrating self-attention with multi-layer CNNs for automated selection of diagnostically relevant lumbar MRI disc slices. Eliminated manual slice selection by automatically identifying optimal transverse mid-height slices of L3/L4, L4/L5, and L5/S1 IVDs, improving efficiency and reducing human error. Introduced a dual self-attention mechanism, enabling enhanced long-range spatial dependency modeling and precise localization of subtle intervertebral disc variations. Enhanced feature representation through dynamic feature recalibration and strengthened inter-feature dependencies, improving discriminative learning for lumbar spine analysis. Demonstrated superior performance on experimental evaluation, achieving 96.83% accuracy and 98.41% specificity, outperforming state-of-the-art methods and supporting scalable clinical deployment.
Graph Convolutional Network
Developed RGTFormer, a novel hybrid deep learning architecture integrating a categorical gated transformer with a Relational Graph Convolutional Network (RGCN) for mutation-driven drug resistance prediction in tuberculosis (TB). Leveraged multi-modal mutation features, incorporating both genomic sequence and structural characteristics from six key drug-resistance genes to enhance predictive capability. Modeled inter-mutation dependencies using the RGCN, which captures relational information among mutations, while the transformer captures complex feature interactions. Demonstrated superior predictive performance, achieving 98.67% test accuracy and 97.15% cross-validation accuracy, outperforming conventional ML and DL baselines. Validated model effectiveness through ablation studies, highlighting the contribution of gated attention and graph-based learning, and providing an interpretable framework for clinically relevant resistance prediction and personalized TB treatment.
Deep Learning, Brain Computer Interface
Developed a novel BMFCNet architecture integrating blended multi-level feature extraction for robust detection of Major Depressive Disorder (MDD) from EEG signals. Introduced a Residual-Inception module to effectively capture both low-level (LL) and high-level (HL) discriminative EEG features, enhancing representational capacity. Developed a Constraint Fusion mechanism for adaptive weighting and fusion of LL and HL features, improving feature integration and classification performance. Addressed subjectivity in MDD diagnosis by providing an automated, EEG-based framework that enhances accuracy, reliability, and clinical applicability. Validated the proposed model on benchmark datasets, demonstrating superior performance compared to 16 state-of-the-art methods in terms of accuracy and efficiency.
Deep Learning and Computer Vision
• Proposes GAT-TNBC-Net, a dual-branch attention-based model for TNBC subtype classification.
• Combines Lightweight Attentive Head Network (LAHN) and Graph Attention Network (GAT).
• Captures both global gene dependencies and structural gene relationships.
• Uses multi-head attention fusion for integrating tabular and graph features.
• Works on high-dimensional RNA-seq data (55,662 genes per sample).
• Achieves 97.22% test accuracy and 92.22% cross-validation accuracy.
• Supports personalized treatment planning with interpretable predictions.
Deep Learning and Computer Vision
• Proposes SARANet, a self-attentive multi-scale residual attention network for Braille recognition.
• Combines R-CBAM (channel and spatial attention) and Compact Self-Attention (CSA).
• Captures fine-grained details and long-range dependencies in Braille images.
• Works on an Arabic Braille dataset with 32 character classes.
• Achieves 95.29% accuracy and an F1-score of 0.952.
• Improves optical Braille recognition (OBR) performance.
• Supports assistive technology for visually impaired individuals.
Deep Learning and Computer Vision
• Proposes an AI-assisted otoscope framework for automated ear disease diagnosis.
• Enables early and accurate detection of common ear conditions from otoscopic images.
• Utilizes a deep learning-based image analysis pipeline for robust classification.
• Reduces dependency on specialist expertise, supporting primary healthcare screening.
• Designed for real-time and portable diagnostic applications.
• Demonstrates high classification accuracy on clinically relevant datasets.
• Enhances accessible, scalable, and cost-effective ear healthcare solutions.
Deep Learning and Medical Image Analysis
• TinyEyeNet introduces a lightweight CNN for accurate anterior segment eye disease classification.
• Designed for high performance with low computational cost, ideal for real-world deployment.
• Trained and validated on a custom-curated clinical eye image dataset.
• Achieves strong diagnostic accuracy, outperforming conventional deep models.
• Suitable for resource-constrained and portable ophthalmic screening systems.
• Enables faster, scalable, and accessible eye disease detection.
Deep Learning, Convolutional Neural Network (CNN) and Computer Vision
• Lightweight multi-attention CNN designed for real-time knot classification.
• Achieves 95.16% test accuracy, outperforming all benchmark models.
• Uses Ghost, SE, and CBAM modules with stochastic depth for efficiency.
• Computationally efficient at 1.03 GFLOPs with 59 ms inference time.
• Grad-CAM explainability highlights knot-specific visual cues for trust and transparency.
• Robust performance across lighting, background, and rope tension variations.
Deep Learning & AI in Healthcare
• Introduces the RAIE Transformer, a lightweight, recency-augmented architecture designed specifically for accurate cricket ball trajectory prediction using short input sequences.
• Uses a curated dataset of 4,400 frames from 212 cricket deliveries, manually annotated and preprocessed for reliable spatio-temporal modelling.
• Incorporates a novel Recency-Augmented Input Embedding (RAIE) that prioritizes recent motion cues while preserving long-term context to improve prediction accuracy.
• Outperforms traditional models such as LSTM, RNN, Base Transformer, Informer, TimeMixer, and Graph Transformer, achieving MSE 3.95, ADE 2.35, and FDE 2.56, the best among all tested models.
• Provides a fully automated end-to-end pipeline combining YOLO-based detection, interpolation, and autoregressive prediction, eliminating manual annotation requirements.
• Demonstrates strong real-time potential, handling non-linear motion (swing, spin, bounce) with high efficiency and low computational cost, suitable for sports analytics and broadcast applications.
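The ADE and FDE figures reported above are not expanded in the text. Assuming they follow the usual trajectory-prediction convention (Average Displacement Error and Final Displacement Error), they can be sketched as below. The function name `displacement_errors` and the sample points are illustrative, not taken from the project.

```python
import math


def displacement_errors(predicted, actual):
    """Return (ADE, FDE) for two equal-length lists of (x, y) points.

    Assumed definitions: ADE is the mean Euclidean distance over all
    predicted steps; FDE is the distance at the last step only.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and equal in length")
    # Euclidean distance between predicted and true position at each step.
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    ade = sum(distances) / len(distances)
    fde = distances[-1]
    return ade, fde


# Hypothetical three-step trajectory (units are whatever the tracker uses).
predicted = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
actual = [(0.0, 0.0), (1.0, 2.0), (2.0, 5.0)]
ade, fde = displacement_errors(predicted, actual)
```

Because errors in autoregressive prediction tend to accumulate, FDE is typically at least as large as ADE, which matches the ordering of the reported values.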
Deep Learning and Computer Vision
• Next-Gen AI Safety: SalienNext uses an upgraded ConvNeXt + CBAM attention design to spot drowsy drivers with far higher accuracy than traditional vision models.
• Ultra-Sharp Eye Detection: Multi-stage attention zooms into subtle eye and facial cues, enabling the system to detect even minor signs of fatigue.
• State-of-the-Art Accuracy: Achieves 99% accuracy on the MRL Eye Dataset, outperforming leading deep learning models in the field.
• Lightweight & Real-Time Ready: Optimized for speed and efficiency, making it deployable on edge devices inside vehicles.
• Smart Attention Fusion: Combines low-level details and high-level semantics for a clearer understanding of drowsiness signals.
• Designed for Safer Roads: Built to support next-gen ADAS and intelligent transportation systems, targeting reduced accidents and enhanced driver monitoring.
Transformer Models & Natural Language Processing
• MultiFlipFormer is a multimodal transformer designed for emotion flip reasoning and instigator detection in therapeutic conversations.
• It models dynamic emotional transitions across dialogue turns using textual, visual, and contextual cues.
• Trained on the MESC dataset with over 28,000 utterances covering 7 emotions and 10 therapy strategies.
• Achieves a weighted F1-score of 0.828 and perfect instigator detection accuracy (1.000).
• Outperforms existing models like TGIF and MPT-HCL, offering real-time therapeutic insights and emotion trajectory forecasting.
• Future work focuses on explainability, few-shot learning, and multilingual adaptation for broader clinical use.
Natural Language Processing and Machine Learning
• Introduces a resource-efficient reinforcement learning framework using Group Relative Policy Optimization (GRPO) to enhance mathematical reasoning in small-scale language models.
• Implements structured prompting with XML-style tags to clearly separate reasoning steps from final answers, improving interpretability and training alignment.
• Integrates memory-efficient optimization techniques (8-bit AdamW, mixed precision, gradient checkpointing, and accelerated decoding) to enable single-GPU fine-tuning.
• Achieves 50.95% accuracy on the GSM8K benchmark, outperforming larger models despite using only a 0.5B parameter model.
• Demonstrates that compact LLMs can achieve competitive reasoning ability, making reinforcement learning viable in low-resource environments.
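As a rough illustration of the two ideas above, the sketch below scores a group of sampled completions by parsing an XML-style answer tag, then converts the rewards into group-relative advantages, the quantity GRPO uses in place of a learned value baseline. The tag names `<reasoning>` and `<answer>`, the exact-match reward, and the use of the population standard deviation are assumptions made for this example; the project's actual prompt format and reward function are not given here.

```python
import re
import statistics

# Assumed tag name; the source only says "XML-style tags".
ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def correctness_reward(completion, reference):
    """Return 1.0 if the tagged answer matches the reference, else 0.0."""
    match = ANSWER_PATTERN.search(completion)
    if match is None:
        return 0.0  # malformed output: no answer tag found
    return 1.0 if match.group(1).strip() == reference else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalise rewards within one group of samples for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps avoids division by zero when every sample gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Four hypothetical completions sampled for the prompt "What is 6 * 7?".
completions = [
    "<reasoning>6 * 7 = 42</reasoning><answer>42</answer>",
    "<reasoning>6 + 7 = 13</reasoning><answer>13</answer>",
    "<reasoning>6 * 7 = 42</reasoning><answer> 42 </answer>",
    "The answer is 42",
]
rewards = [correctness_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
```

Completions that beat the group average receive a positive advantage and the rest a negative one, so no separate critic network is needed, which is part of what keeps the approach feasible on a single GPU.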
Deep Learning and Medical Image Analysis
• Proposes a Self-Attention U-Net integrated with Residual Bottleneck layers to generate high-quality MRI images from low-cost CT scans.
• Utilizes self-attention mechanisms to capture long-range dependencies, enhancing anatomical and structural accuracy in medical image translation.
• Incorporates a spectral-normalized Patch Discriminator to ensure realistic MRI-like outputs with improved perceptual quality.
• Achieves strong results with PSNR up to 25.78 dB and SSIM up to 0.7125, outperforming baseline models like U-Net, Pix2Pix, and CycleGAN.
• Offers a cost-effective solution for medical imaging in resource-limited regions, improving diagnostic accessibility and healthcare outcomes.
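For readers unfamiliar with the PSNR figure quoted above, the following minimal sketch shows the standard peak signal-to-noise ratio between a reference image and a generated one. The 8-bit intensity range (`max_value=255`) and the tiny sample images are assumptions for illustration; the project's evaluation settings are not stated.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two same-sized 2D images."""
    flat_ref = [v for row in reference for v in row]
    flat_gen = [v for row in generated for v in row]
    if len(flat_ref) != len(flat_gen) or not flat_ref:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(flat_ref, flat_gen)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 images that differ in exactly one pixel.
reference = [[0, 255], [255, 0]]
generated = [[0, 255], [255, 255]]
score = psnr(reference, generated)
```

PSNR only measures pixel-wise error, which is why it is usually reported alongside a structural metric such as SSIM, as done above.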
Computer Vision and Security / Multimedia Forensics
• Introduces a lightweight image splicing detection framework based on information-theoretic principles, avoiding the need for deep learning or GPUs.
• Extracts 20 handcrafted features across spatial, cross-channel, and multi-scale domains using entropy, mutual information, and Kolmogorov complexity.
• Achieves 85.32% accuracy, 0.934 AUC-ROC, and 0.8571 F1-score on the Columbia Image Splicing Dataset, rivaling deep models.
• Provides interpretable results through feature importance analysis, identifying key tampering indicators like information variance and conditional entropy.
• Demonstrates a computationally efficient, explainable, and generalizable solution for digital image forgery detection in forensic and media applications.
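Two of the feature families named above, entropy and mutual information, can be sketched with the standard library alone. This is an illustrative example on toy pixel lists, not the project's 20-feature extractor; the channel values are made up.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned sequences."""
    joint = list(zip(xs, ys))
    return entropy(xs) + entropy(ys) - entropy(joint)


# Hypothetical quantised channel values for four pixels.
red = [0, 0, 1, 1]
green = [0, 0, 1, 1]   # moves in lockstep with red
blue = [0, 1, 0, 1]    # statistically independent of red

mi_red_green = mutual_information(red, green)
mi_red_blue = mutual_information(red, blue)
```

The intuition for forgery detection is that natural images show consistent cross-channel dependence, so a region whose mutual information departs from its surroundings is a candidate for tampering. Conditional entropy follows from the same pieces as H(Y|X) = H(X,Y) - H(X).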
Machine Learning
• Proposes a deep learning-based intrusion detection framework combining Multi-Head Self-Attention (MHSA) and Residual Dense Blocks (RDBs) for fine-grained attack classification.
• Effectively addresses class imbalance using a dynamic sampling strategy with SMOTE oversampling and random under-sampling.
• Captures complex feature dependencies and ensures stable gradient flow for deep network training on imbalanced flow-based datasets.
• Evaluated across four temporal windows (5s, 10s, 30s, 60s), achieving 99.89% peak accuracy on the 10-second window.
• Demonstrates high recall on rare attack types (e.g., recon-dns, brute force-ftp), ensuring robust, fair, and adaptive real-time intrusion detection.
Computer Vision
• Proposes a Compact Convolutional Transformer (CCT) for colon cancer detection.
• Combines CNN (local features) + Transformer (global attention) for better learning.
• Works effectively on medical images with limited dataset size.
• Includes data augmentation, patch tokenization, and transformer encoder layers.
• Achieves 97.54% accuracy and 97.24% precision.
• Outperforms CNN, ResNet, and Vision Transformer models.
• Enhances diagnostic accuracy and supports improved patient outcomes.
Deep Learning, Remote Sensing, Segmentation
• Focuses on automated semantic segmentation of land cover using high-resolution satellite imagery.
• Proposes a deep learning model combining residual encoder, feature pyramid network, and U-Net decoder.
• Achieves 55.14% mIoU and 84.4% pixel accuracy, outperforming previous approaches.
• Utilizes a diverse dataset with various land types like forests, agriculture, urban, and barren areas.
• Enables scalable and reliable land cover analysis for applications in urban planning, environment, and disaster response.
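The gap between the mIoU and pixel-accuracy figures above is easier to read with the two metrics side by side. The sketch below computes both from flat label lists using their standard definitions; the six-pixel example and the class indices are hypothetical.

```python
def segmentation_scores(predicted, target, num_classes):
    """Return (mean IoU, pixel accuracy) for flat lists of class labels."""
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(predicted, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # ignore classes absent from both prediction and target
            ious.append(intersection[c] / union)

    mean_iou = sum(ious) / len(ious)
    pixel_accuracy = sum(intersection) / len(target)
    return mean_iou, pixel_accuracy


# Hypothetical labels: 0 = forest, 1 = urban, 2 = barren.
target = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 2, 2, 2]
mean_iou, pixel_accuracy = segmentation_scores(predicted, target, num_classes=3)
```

A single mislabelled pixel lowers the IoU of two classes at once (the one it was taken from and the one it was assigned to), and every class counts equally in the mean. Mean IoU therefore usually sits well below pixel accuracy, especially when some land types are rare.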
Deep Learning
SignSpeakNet is a lightweight deep learning model designed for real-time sign language recognition, integrating spatial landmarks with a multi-head attention-guided Bi-LSTM network. It effectively captures spatial and temporal gesture patterns, achieving 96.39% accuracy on a custom 20-gesture dataset. The attention mechanism enhances focus on key frames, while the Bi-LSTM models bidirectional temporal context. Optimized for low-resource devices, SignSpeakNet offers a scalable and accessible solution for inclusive communication.
Biomedical Signal Processing, Deep Learning, Mental Health
The research presents a deep spectro-temporal learning framework for accurately detecting mental fatigue using multi-modal physiological signals such as EEG, EDA, heart rate, body temperature, and blood volume pulse. By extracting features through Mel-Frequency Cepstral Coefficients (MFCCs) and leveraging a hybrid 1D-CNN and BiLSTM architecture, the model effectively captures both local and temporal patterns within the data. Achieving a high classification accuracy of 91.65% on the MEFAR dataset, the framework outperforms traditional methods and demonstrates strong potential for real-time, scalable applications in healthcare, cognitive workload management, and occupational safety.
Biomedical Image Processing, Deep Learning, Convolutional Neural Networks
• Focuses on osteoporosis detection using dental periapical radiograph images.
• Develops a custom deep learning model for multi-class classification (osteoporosis, osteopenia, normal).
• Compares performance with pre-trained models (DenseNet121/201, VGG19, MobileNetV2).
• Custom model achieves 95.17% accuracy and AUC of 99.59%.
• Outperforms fine-tuned transfer learning models in both accuracy and AUC.
• Enables early diagnosis and personalized treatment planning.
• Shows strong potential for real-world clinical deployment and AI-based screening.
Computer Vision and Deep Learning
• Proposes MHA-Mobile, combining MobileNetV2 with multi-head attention.
• Classifies biodegradable vs. non-biodegradable materials using images.
• Achieves 95.19% accuracy in waste classification.
• Deploys the model as a mobile app using Flutter and Dart.
• Provides a lightweight and real-time solution for waste segregation.
• Eliminates need for manual sorting, improving efficiency.
• Supports environmental sustainability and smart waste management.
Deep Learning, Medical Imaging
AD Progression Mapping uses dilated convolutions in a lightweight CNN to classify six stages of Alzheimer’s with 91% accuracy. It enhances early detection by capturing subtle brain changes and automating MRI analysis for timely clinical intervention.
Deep Learning, Medical Imaging
DeepDementia, enhanced with CBAM, accurately classifies five stages of dementia from MRI scans, achieving 95.71% test accuracy. Its fine-grained detection and attention-driven insights enable earlier diagnosis and personalized care.
Deep Learning and Attention Network
• Proposes a low-cost AI-based sign language recognition system using video input.
• Uses LSTM with attention mechanism to capture temporal gesture patterns.
• Focuses on key frames for improved recognition efficiency.
• Trained on a self-curated dataset of 14 essential gestures.
• Achieves 92.1% accuracy across gesture classes.
• Eliminates need for expensive sensors or custom hardware.
• Enables accessible communication solutions for the deaf-mute community.
Large Language Models and RAG
• Proposes an AI framework integrating LLMs with RAG and vector databases for medical data handling.
• Handles large-scale structured and unstructured healthcare data efficiently.
• Addresses key LLM challenges like hallucination, outdated knowledge, and retrieval issues.
• Utilizes advanced retrieval techniques to improve response accuracy.
• Achieves strong performance (ROUGE-1 F1: 0.718, BLEU: 0.4709).
• Generates accurate responses within ~35 seconds, ensuring efficiency.
• Supports public health decision-making, disease surveillance, and clinical applications.
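The ROUGE-1 F1 score quoted above measures unigram overlap between a generated answer and a reference answer. A minimal sketch follows, assuming lowercase whitespace tokenisation with no stemming; evaluation toolkits often add further normalisation, and the example sentences are invented.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate and a reference text."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as in the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical generated answer and reference answer.
candidate = "the patient has a mild fever"
reference = "the patient has a high fever and cough"
score = rouge1_f1(candidate, reference)
```

Because the metric rewards shared words rather than shared meaning, it is best read together with BLEU and a manual check for factual errors.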
Computer Vision, Deep Learning, Natural Language Processing
• Revolutionizing Diagnostics: Introduces a dual-mode framework combining Vision Transformer (ViT) models for high-precision image classification with Retrieval-Augmented Generation (RAG) for contextual insight generation.
• Enhanced Medical Interpretation: The RAG system bridges the gap between raw predictions and clinical understanding by retrieving relevant medical literature tied to classification outcomes.
• Exceptional Accuracy: Achieves 99.30% classification accuracy on a custom parasite and blood cell dataset, outperforming leading baseline models.
• Resource-Conscious Innovation: Tailored for use in low-resource medical settings, offering fast, accurate, and explainable diagnostic support.
• Clinical and Educational Utility: Enables early diagnosis and supports medical training by providing both image-based decisions and synthesized medical context.
Deep Learning and Computer Vision
• Hybrid AI model combining EfficientNet and Swin Transformer.
• 99% deepfake detection accuracy on StyleGAN-generated faces.
• Lightweight architecture with only ~4.25M parameters.
• Captures local + global facial artifacts using attention mechanisms.
• Fast inference, suitable for near real-time detection.
• Explainable AI with Grad-CAM visual insights.
• Designed for real-world cybersecurity and digital forensics.
Computer Vision and Attention
• Advanced Architecture: AMSFFNet integrates dilated convolutions and adaptive attention to capture fine-grained and contextual MRI features.
• High Diagnostic Accuracy: Achieves 97.14% and 95.00% accuracy on two public breast MRI datasets, outperforming DCNN, GoogleNet, and DMRBNet.
• Robust Clinical Performance: Demonstrates strong sensitivity, specificity, and AUC, confirming its reliability and generalizability.
• Early Detection Focus: Designed specifically for early-stage breast cancer recognition, enhancing clinical decision support through automation.
Computer Vision
• Assistive Real-Time Detection: BlindAssist uses an enhanced YOLOv9s architecture to identify indoor obstacles and guide users via directional audio using the pyttsx3 library.
• Innovative Module Variants: Introduces G-RepNCSPELAN, CBAM-RepNCSPELAN, and GCBAM-RepNCSPELAN to boost detection accuracy and speed.
• Performance Boost: CBAM-RepNCSPELAN improves mAP50 by 8.09%, with 13.36 ms inference time, while other variants balance speed and complexity.
• Statistically Validated: The gains are statistically significant (p = 0.00026), confirming effectiveness in assistive object detection for visually impaired users.
BiLSTM & Attention
• Advanced Architecture: Introduces a stacked BiLSTM with multi-scale, multi-head attention to handle motion variability and sensor orientation.
• Robust Performance: Achieves 95% accuracy across 2000 sessions, outperforming CNN+LSTM and standard BiLSTM models.
• Rich Sensor Input: Leverages IMU data (accelerometer, gyroscope, magnetometer) from 200 participants performing 10 exercise types.
• Real-Time Readiness: Demonstrates stable training, minimal overfitting, and precise classification, making it well suited to wearable health and fitness tech.
Transformer Model & LLMs
• Client-Side Deployment: Runs entirely in-browser via TensorFlow.js and WebGPU, ensuring privacy and low latency.
• High Accuracy Detection: Achieves 98% accuracy and a 0.9979 ROC-AUC on phishing message classification.
• Robust Transformer Backbone: Leverages a fine-tuned RoBERTa model trained on over 59,000 SMS, Telegram, and Enron spam samples.
• Practical Browser Extension: No server-side dependency, enabling real-time phishing protection across messaging platforms.
GANs (Generative Adversarial Networks), LSTM (Long Short-Term Memory)
• Innovative Model Architecture: Combines Conditional GAN for synthetic data generation with CNN-LSTM for capturing spatial-temporal patterns.
• High Accuracy Performance: Achieves 99.23% on each key metric (accuracy, precision, recall, F1-score) using k-fold cross-validation.
• Rich Multimodal Input: Utilizes diverse wearable data, including EDA, ECG, EMG, temperature, respiration, and accelerometer signals.
• Real-Time Ready & Interpretable: Offers both robustness and interpretability, making it suitable for practical real-time stress monitoring applications.
Computer Vision and Deep Learning
• Applies 2D Fourier Transform to convert images into the frequency domain.
• Combines spatial + frequency domain features for better analysis.
• Uses CNN for classification of benign vs. malignant skin lesions.
• Helps identify hidden patterns not visible in raw images.
• Achieves 83% accuracy with AUC of 0.93.
• Improves differentiation between skin cancer and similar-looking conditions.
• Demonstrates effectiveness of hybrid feature representation approach.
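To make the frequency-domain step above concrete, the sketch below computes the magnitude of a 2D discrete Fourier transform directly from its definition. It is written with the standard library so it runs anywhere; the nested loops are only practical for tiny inputs, and real pipelines would normally call an FFT routine such as NumPy's `numpy.fft.fft2`. The striped test image is invented for the example.

```python
import cmath
import math


def dft2_magnitude(image):
    """Magnitude spectrum |F(u, v)| of a small 2D image (list of rows)."""
    rows, cols = len(image), len(image[0])
    spectrum = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for x in range(rows):
                for y in range(cols):
                    # Standard DFT kernel exp(-2*pi*i*(ux/M + vy/N)).
                    angle = -2.0 * math.pi * (u * x / rows + v * y / cols)
                    total += image[x][y] * cmath.exp(1j * angle)
            out_row.append(abs(total))
        spectrum.append(out_row)
    return spectrum


# Hypothetical 4x4 image of vertical stripes (pattern repeats every 2 columns).
image = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
]
spectrum = dft2_magnitude(image)
```

The periodic stripes collapse into just two non-zero coefficients: the overall brightness at `spectrum[0][0]` and the stripe frequency at `spectrum[0][2]`. Repetitive texture that is spread across many pixels in the spatial image becomes a compact, easily learned signal in the frequency domain, which is the motivation for feeding both views to the CNN.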
Computer Vision and Deep Learning
• Proposes a feature fusion approach combining multiple DCNNs (EfficientNetB3, ResNet50, VGG16, ConvNeXtTiny, DenseNet121).
• Extracts rich deep features from dermoscopic skin images.
• Uses XGBoost for feature selection (K-Best = 1000) based on importance scores.
• Performs classification of benign vs. malignant melanoma.
• Achieves strong performance with AUC of 0.95.
• Outperforms single-model (standalone DCNN) approaches.
• Analyzes impact of number of selected features on performance.
Computer Vision and Deep Learning
• Uses MobileNetV2 (transfer learning) for pneumonia detection from chest X-rays.
• Introduces hybrid loss (cross-entropy + focal loss) to handle class imbalance.
• Applies oversampling to improve minority class learning.
• Achieves 91% accuracy with AUC of 0.97.
• Reduces dependency on manual diagnosis by experts.
• Provides robust and efficient automated detection system.
• Suitable for real-world clinical and healthcare applications.
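The text above does not say how the cross-entropy and focal terms are combined. One plausible reading, sketched here for a single binary prediction, is a weighted sum; the mixing `weight` of 0.5 and the focusing parameter `gamma` of 2.0 are assumptions chosen for illustration, not the project's settings.

```python
import math


def hybrid_loss(prob_positive, label, weight=0.5, gamma=2.0):
    """Weighted sum of cross-entropy and focal loss for one binary sample.

    prob_positive: predicted probability of the positive class.
    label: 1 for positive (e.g. pneumonia), 0 for negative.
    """
    # p_t is the probability the model assigned to the true class.
    p_t = prob_positive if label == 1 else 1.0 - prob_positive
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    cross_entropy = -math.log(p_t)
    # The focal term down-weights samples the model already gets right.
    focal = (1.0 - p_t) ** gamma * cross_entropy
    return weight * cross_entropy + (1.0 - weight) * focal


easy_loss = hybrid_loss(0.9, 1)   # confident and correct
hard_loss = hybrid_loss(0.1, 1)   # confident and wrong
```

The cross-entropy part keeps gradients flowing for every sample, while the focal part concentrates training on hard, often minority-class, cases. That complements the oversampling step rather than replacing it.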
Deep Learning, Computer Vision and Convolutional Neural Networks (CNNs)
• Proposes MobileInceptionNet, combining MobileNet (lightweight) and Inception (multi-scale feature extraction).
• Designed for automatic classification of apple leaf diseases across 13 classes.
• Handles variations in color, texture, and disease patterns effectively.
• Achieves 94.99% accuracy with strong precision (0.947), recall (0.946), and F1-score (0.946).
• Provides a computationally efficient model, suitable for mobile and edge devices.
• Reduces reliance on manual inspection and expert knowledge.
• Supports early disease detection, helping prevent crop loss.
• Useful for smart agriculture and real-time monitoring systems.
Deep Learning and Computer Vision
• Proposes a multi-scale, multi-attention MobileNetV2 model for cervical cancer classification.
• Enhances feature extraction across different resolutions using multi-scale learning.
• Uses attention mechanisms to focus on important cellular features.
• Works on Pap smear cell images for automated diagnosis.
• Achieves 92.4% accuracy, outperforming existing approaches.
• Offers a lightweight and efficient solution for clinical applications.
Computer Vision and Medical Imaging
• Proposes NestedVGG, a transfer learning-based model using VGG16 for feature extraction.
• Combines outer (feature extraction) and inner (fine-grained classification) architectures.
• Classifies glioma, meningioma, pituitary, and no-tumor from MRI scans.
• Achieves high accuracy of 97.71%, outperforming existing methods.
• Reduces manual effort and inter-expert variability in diagnosis.
• Demonstrates robust and consistent performance across tumor types.
Computer Vision
• Focuses on early detection of Acute Ischemic Stroke (AIS) using AI.
• Utilizes MRI imaging for identifying brain abnormalities.
• Proposes an ensemble model combining ResNet50 and EfficientNetB0.
• Incorporates a self-attention mechanism to enhance feature selection.
• Effectively captures subtle and complex stroke patterns.
• Reduces dependency on manual clinical diagnosis.
• Improves accuracy and speed of detection.
• Achieves 95% classification accuracy.
• Demonstrates better performance than traditional methods.
• Supports clinical decision-making and early intervention.
Deep Learning, Convolutional Neural Network (CNN) and Computer Vision
• Hybrid AI Architecture: Introduces a novel fusion of CNN, Residual Networks, and Capsule Layers for accurate yoga pose estimation.
• Smarter Pose Understanding: Capsule layers preserve spatial relationships, enabling better recognition of complex and overlapping yoga postures.
• Training Made Efficient: Residual connections overcome vanishing gradient issues, allowing deeper learning with stable performance.
• High Accuracy, Fewer Parameters: Achieves 96.62% validation accuracy with significantly fewer trainable parameters than popular deep models.
• Robust Real-World Performance: Effectively handles pose variations, occlusions, lighting changes, and diverse backgrounds.
• Optimizer Comparison: Adam optimizer outperforms RMSprop and SGD, delivering the best classification results.
Computer Vision and Large Language Models
• AI for Plant Health: This project combines image recognition (CNN) with language understanding (LLM) to help farmers identify plant diseases.
• Diverse Plant Diseases: The system distinguishes 48 classes from images, covering plant diseases as well as healthy plants.
• Easy-to-Use: A user-friendly web application makes it simple for farmers to use the system.
• Accurate Predictions: The AI model is highly accurate, correctly identifying diseases in most cases.
• Helpful Advice: The LLM provides farmers with real-time information and advice on how to treat the identified diseases.
Large Language Models (LLMs)
• AI that "Sees" and "Reads": This project combines a multimodal AI model (Gemini 1.5 Flash) with a fast vector similarity search library (FAISS) to analyze both text and images within PDF documents.
• Smart Searching: The system can quickly find the most relevant pieces of text and images based on user queries.
• Comprehensive Answers: The AI then uses the retrieved information to generate comprehensive and informative answers, combining insights from both text and images.
• Efficient Processing: The system is designed to work efficiently even with large amounts of data.
• Versatile Tool: This approach has many applications, especially in fields that require analyzing information from various sources like text and images.
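The retrieval step described above can be illustrated with a minimal sketch. It assumes each PDF chunk (text or image) has already been turned into an embedding vector, and it uses a plain cosine-similarity ranking as a stand-in for the FAISS index; the chunk ids, the three-dimensional vectors, and the function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, chunks, k=2):
    """Return the ids of the k chunks most similar to the query embedding."""
    ranked = sorted(chunks.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]

# Hypothetical, hand-made embeddings for three PDF chunks (assumption).
chunks = {
    "text_p1": [1.0, 0.0, 0.0],
    "image_p2": [0.9, 0.1, 0.0],
    "text_p7": [0.0, 1.0, 0.0],
}

query = [1.0, 0.05, 0.0]
print(top_k(query, chunks, k=2))  # ['text_p1', 'image_p2']
```

In a full pipeline the returned chunks would be passed to the language model as context; an approximate nearest-neighbour index replaces the exhaustive sort once the collection grows large.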
Time Series Analysis & Deep Learning
• AI for Clean Ganga: This research uses advanced AI (LSTM) to predict water quality in the Ganga River.
• Accurate Predictions: The system forecasts key water quality indicators such as pH, dissolved oxygen (DO), and biochemical oxygen demand (BOD) with 96.86% accuracy.
• Early Warnings: It provides early warnings of potential water pollution issues, allowing for timely action.
• Adaptable System: The system can adjust to changing conditions like seasons, pollution levels, and industrial discharges.
• Scalable Solution: This approach can be used to improve water quality in other rivers facing pollution problems.
Large Language Models (LLMs)
• AI Detects Fake CT Scans: This project uses a sophisticated AI model (DeepMedFuseX) to identify fake CT scans (deepfakes) created using artificial intelligence.
• Focus on Details: The AI model uses a special attention mechanism (CBAM) to focus on the most important details in the CT scans, improving its ability to detect fakes.
• High Accuracy & Reliability: The model achieves high accuracy and reliability in identifying fake CT scans, helping to ensure the accuracy of medical diagnoses.
• Understanding the AI: The model also explains its reasoning, highlighting the specific areas of the CT scan that led to its decision, which is crucial for trust in AI-powered medical diagnoses.
• Improved Patient Safety: This technology helps to prevent misdiagnoses based on manipulated images, improving patient safety in the context of digital healthcare.
Attention Based Deep Learning
• AI for Healthy Crops: This project uses a sophisticated AI model (attention-augmented multiscale CNNs) to accurately detect diseases and nutrient deficiencies in cotton and date palm plants.
• Seeing the Big and Small Picture: The AI model can analyze both fine and coarse details of plant health, allowing for more accurate diagnoses.
• High Accuracy: The model achieves high accuracy in detecting plant problems, with 96.42% accuracy for cotton and 94.93% for date palms.
• Better Yields: This technology can help farmers identify and address plant problems early, leading to healthier crops and improved yields.
Deep Learning and RAG
• AI for Ayurveda: This project uses a powerful AI system (LLMs with RAG) to make it easier to access and use the vast knowledge of Ayurveda.
• Improved Access to Knowledge: The system helps doctors and researchers easily find and use relevant information from ancient Ayurvedic texts.
• Accurate and Personalized Information: The system provides accurate and personalized information, tailored to specific patient needs and conditions.
• Bridging Tradition and Modern Medicine: This technology can help bridge the gap between traditional Ayurveda and modern medicine, leading to better healthcare outcomes.
• Future of Ayurvedic Care: This system has the potential to revolutionize Ayurvedic care by providing a more accessible and efficient way to use this ancient system of medicine.
Computer Vision
• AI Detects Fake Medical Images: This project uses a powerful AI model (CNN with CBAM) to identify fake medical images, such as manipulated CT scans.
• Focused Approach: The AI model intelligently focuses on the most important details in the images to accurately spot the fakes.
• High Accuracy: The model achieves a very high accuracy of 97.12% in detecting fake medical images.
• Improved Patient Safety: This technology helps ensure the accuracy of medical diagnoses and improves patient safety by preventing misdiagnoses based on manipulated images.
• Reliable Solution: This approach provides a fast and reliable solution for detecting fake medical images, contributing to trust in digital healthcare systems.
Deep Learning
• AI for Seizure Detection: This project uses a cutting-edge AI model (NeuroAttention-Net) to accurately detect epileptic seizures from brainwave recordings (EEG).
• Focus on Key Signals: The AI model intelligently focuses on the most important patterns in the brainwave data to improve seizure detection accuracy.
• High Accuracy: This approach achieves a remarkable accuracy of 99.3% in identifying seizure activity.
• Reduced False Alarms: The model minimizes false alarms, providing more reliable and trustworthy results.
• Improved Patient Care: This technology has the potential to significantly improve the diagnosis and management of epilepsy, leading to better patient outcomes.
Deep Learning and Computer Vision
• AI for COVID-19 Severity: This project uses advanced AI (deep soft attention networks) to accurately assess the severity of COVID-19 infections in chest X-ray images.
• Focus on Key Areas: The AI model intelligently focuses on the most important parts of the X-ray images, such as lung abnormalities, to make more accurate diagnoses.
• Improved Accuracy: This approach achieves a high accuracy of 87.33% in assessing COVID-19 severity.
• Faster & More Reliable: It can help doctors quickly and reliably assess the severity of COVID-19, especially in resource-limited settings.
• Broader Applications: This technology has the potential to improve the diagnosis and management of other diseases using medical images.
Deep Learning and Computer Vision
• AI for Spinal Diagnosis: This project uses a cutting-edge AI model (SDAG-CNN) to analyze X-ray images and accurately diagnose various spinal lesions.
• Improved Accuracy: The AI model achieves a high accuracy of 93.33% in identifying different types of spinal problems.
• Faster & More Reliable: Compared to traditional methods, this AI-powered approach is faster and more consistent, reducing reliance on individual doctors' expertise.
• Efficient & Portable: The model is designed to be compact and efficient, making it suitable for use in resource-limited settings.
• Broader Applications: This technology has the potential to improve diagnosis for various medical imaging tasks beyond spinal problems.
Machine Learning
• Introduces a machine learning pipeline that combines demographic data with key urinary biomarkers for pancreatic cancer detection.
• Underscores the role of urinary biomarkers in the early identification of pancreatic cancer.
• Early Detection: Utilizes urinary biomarkers to detect pancreatic ductal adenocarcinoma (PDAC) at an early stage.
• Two-Stage Machine Learning: Differentiates PDAC from benign cases and further classifies PDAC into grades I–IV.
• High Accuracy: Optimized SVM classifier achieves 99.7% accuracy.
• Enhances early diagnosis and treatment precision.
• Proposes a hybrid activation function (ReLU + ELU) to overcome limitations of existing functions.
• Addresses issues like dying ReLU and vanishing gradients.
• Demonstrates improved performance on CNN and MLP models using MNIST & FashionMNIST datasets.
• Shows better convergence and learning efficiency compared to standard activation functions.
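The summary does not state how ReLU and ELU are combined, so the sketch below shows only one plausible form, a weighted blend of the two. The blend rule, the `mix` parameter, and the function names are assumptions for illustration and may differ from the published function. For positive inputs both parts equal `x`; for negative inputs the ELU part keeps the output non-zero, which is the property usually cited as a remedy for dying ReLU.

```python
import math

def relu(x):
    """Standard ReLU: zero for negative inputs."""
    return max(0.0, x)

def elu(x, a=1.0):
    """Standard ELU: smooth, saturating negative branch."""
    return x if x > 0 else a * (math.exp(x) - 1.0)

def hybrid(x, mix=0.5, a=1.0):
    """Assumed blend: mix * ReLU(x) + (1 - mix) * ELU(x)."""
    return mix * relu(x) + (1.0 - mix) * elu(x, a)

for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, relu(x), round(hybrid(x), 4))
```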
Computer Vision, Transfer Learning
• Uses VGG16 with multi-head attention for potato leaf disease detection.
• Identifies Early blight and Late blight from leaf images.
• Achieves 91% accuracy and F1-score of 0.9103.
• Enhances feature learning by capturing spatial relationships in images.
Computer Vision and Medical Imaging
• Proposes MSAN-Net, a multi-scale attention network for Alzheimer’s detection.
• Uses MRI brain images to classify four stages of dementia.
• Demonstrates superior performance compared to existing methods.
• Supports early diagnosis to help slow disease progression.
Deep learning, Computer Vision
• Proposes a novel Attention-based Transfer Learning framework (ASPPNet) for automated PCOS diagnosis using ultrasound images.
• Integrates ResNet-18 for deep feature extraction with an attention module to capture disease-specific patterns.
• Employs Spatial Pyramid Pooling (SPP) to preserve multi-scale features and handle variable image sizes effectively.
• Enhances robustness against spatial distortions and scale variations in ultrasound imaging.
• Achieves high diagnostic accuracy of 98.79%, outperforming existing state-of-the-art models.
• Enables early detection of PCOS, supporting improved clinical decision-making and preventive healthcare.
• Demonstrates potential for real-world deployment in medical imaging systems and AI-assisted diagnostics.
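How Spatial Pyramid Pooling copes with variable image sizes can be shown with a minimal sketch. This is not the ASPPNet implementation: the function name `spp`, the pyramid levels (1 and 2), and the toy feature maps are assumptions chosen for illustration. Each level splits the map into an n x n grid and max-pools every cell, so the output length depends only on the levels, not on the input size.

```python
import math

def spp(fmap, levels=(1, 2)):
    """Max-pool a 2D feature map over n x n grids and concatenate the results."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of grid cell i (cells may overlap by one row when h % n != 0).
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                pooled.append(max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)))
    return pooled

small = [[1, 2],
         [3, 4]]
large = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]

print(spp(small))  # [4, 1, 2, 3, 4]
print(spp(large))  # [12, 6, 8, 10, 12]
```

Both inputs yield five values (1 + 4 grid cells), which is what allows a fixed-size classifier head to follow feature maps of different shapes.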
Computer Vision, Medical Imaging
• Proposes AMCNN, an attention-based multiscale CNN for PCOS detection.
• Uses dilated convolution + attention mechanism for efficient feature extraction.
• Achieves high accuracy of 98.79%, outperforming existing models.
• Designed for early diagnosis using ultrasound images.
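The effect of dilation can be seen in a simplified one-dimensional sketch. Image models such as AMCNN would use two-dimensional layers; the function name, kernel, and signal here are assumptions for illustration. Dilation spaces the kernel taps apart, so the same three weights cover a wider stretch of the input without adding parameters.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D convolution with kernel taps `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # simple difference filter

print(dilated_conv1d(signal, kernel, dilation=1))  # [-2, -2, -2, -2, -2]
print(dilated_conv1d(signal, kernel, dilation=2))  # [-4, -4, -4]
```

With `dilation=1` each output sees 3 input samples; with `dilation=2` it sees a window of 5, using the same number of weights.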
Deep Learning, Neural Networks & Time Series Analysis
• Proposes SeaNet-1 (LSTM + GRU) and SeaNet-2 (transformer-based) models for sea surface temperature (SST) forecasting.
• Achieves strong performance with 95.16% R² score and low RMSE (27.28%).
• Offers high accuracy with improved computational efficiency over existing models.
• Useful for climate studies, oceanography, and environmental decision-making.
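For readers unfamiliar with the two metrics quoted above, the following sketch computes R² and RMSE from their standard definitions. The temperature values are made up for illustration and are not taken from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and forecast temperatures in degrees Celsius (assumption).
observed = [20.0, 21.0, 22.0, 23.0]
forecast = [20.5, 20.5, 22.5, 22.5]

print(round(rmse(observed, forecast), 3))      # 0.5
print(round(r2_score(observed, forecast), 3))  # 0.8
```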
Time Series, Deep Learning
• Proposes SZ-RAN, a lightweight deep learning model using residual connections + attention mechanism.
• Utilizes EEG signals for early detection of schizophrenia (SZ).
• Achieves very high accuracy of 99.9% on the IBIB-PAN dataset.
• Demonstrates fast convergence and superior performance over existing methods.
Computer Vision
• Proposes a hybrid approach combining deep features + RI-LBP handcrafted features for melanoma detection.
• Uses XGBoost classifier for effective skin lesion classification.
• Achieves strong performance with AUC of 0.91 in distinguishing malignant vs. benign lesions.
• Aims to assist dermatologists by improving accuracy and early detection of melanoma.
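Assuming RI-LBP here refers to the rotation-invariant local binary pattern, the handcrafted descriptor can be sketched for a single 3 x 3 patch as follows. The neighbour ordering, the `>=` threshold convention, and the toy patches are assumptions for illustration, not details taken from the study. Each neighbour is compared with the centre pixel to form an 8-bit code, and taking the minimum over all circular bit rotations makes the code independent of patch orientation.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch, neighbours read clockwise from top-left."""
    center = patch[1][1]
    ring = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(ring):
        if value >= center:
            code |= 1 << bit
    return code

def rotation_invariant(code, bits=8):
    """Smallest value among all circular bit rotations of the code."""
    mask = (1 << bits) - 1
    return min(((code >> r) | (code << (bits - r))) & mask for r in range(bits))

patch = [[9, 9, 1],
         [1, 5, 1],
         [1, 1, 1]]
rotated = [[1, 1, 9],   # the same patch turned 90 degrees clockwise
           [1, 5, 9],
           [1, 1, 1]]

print(lbp_code(patch), lbp_code(rotated))  # 3 12
print(rotation_invariant(lbp_code(patch)), rotation_invariant(lbp_code(rotated)))  # 3 3
```

A histogram of these codes over the lesion image would then be concatenated with the deep features before classification.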
Machine Learning (NLP / Transformer-based)
• Applies transfer learning using pre-trained models (ResNet50, VGG16, InceptionV3) for Indian Sign Language (ISL) recognition.
• Uses ensemble learning with weighted voting to improve prediction accuracy.
• Addresses challenges like limited datasets and high variability in ISL signs.
• Demonstrates enhanced performance across multiple evaluation metrics.
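Weighted voting can be sketched as a weighted average of each model's class probabilities. The summary does not say whether the ensemble votes on probabilities or on hard labels, so the soft-voting form, the weights, and the probability values below are all assumptions for illustration.

```python
def weighted_vote(probabilities, weights):
    """Combine per-model class probabilities using normalised model weights."""
    n_classes = len(next(iter(probabilities.values())))
    total = sum(weights.values())
    combined = [
        sum(weights[m] * probabilities[m][c] for m in probabilities) / total
        for c in range(n_classes)
    ]
    return combined.index(max(combined)), combined

# Hypothetical softmax outputs of three models for one image, three sign classes.
probabilities = {
    "resnet50": [0.6, 0.3, 0.1],
    "vgg16": [0.2, 0.7, 0.1],
    "inceptionv3": [0.3, 0.6, 0.1],
}
# Hypothetical weights, e.g. proportional to each model's validation accuracy.
weights = {"resnet50": 0.5, "vgg16": 0.2, "inceptionv3": 0.3}

label, combined = weighted_vote(probabilities, weights)
print(label, [round(p, 2) for p in combined])  # 1 [0.43, 0.47, 0.1]
```

Here the highest-weighted model prefers class 0, yet the ensemble picks class 1 because the other two models agree on it with enough combined weight.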
Computer Vision, Deep Learning
• Proposes ARINet, a dual self-attention residual-inception network for plant disease detection.
• Combines multi-scale feature extraction, channel attention, and residual connections while using fewer parameters.
• Achieves 77.12% (Cassava) and 98.92% (Rice leaf) accuracy.
• Outperforms existing models with higher efficiency and suitability for lightweight deployment.
Machine Learning
• Uses a transformer-based BERT model with self-attention to detect depression from text data.
• Analyzes linguistic and cognitive patterns like sentiment, emotions, and language usage.
• Achieves high accuracy of 96.86%, outperforming other ML and LSTM models.
• Shows strong potential for early detection and intervention in mental health assessment.
Computer Vision, Deep Learning
• Introduces DMRBNet, a novel CNN with dilated multi-scale residual blocks for MRI-based breast cancer detection.
• Aims to reduce unnecessary biopsies by improving diagnostic accuracy.
• Achieves 98.57% accuracy with a very low error rate (0.1005).
• Outperforms existing methods, showing strong potential for clinical and industrial applications.
Time Series, Deep Learning
• Proposes a novel A-VGGRI (attention-based VGG residual-inception) model for EEG-based depression classification.
• Uses EEG signals + PHQ-9 scores to distinguish healthy vs. Major Depressive Disorder (MDD) patients.
• Achieves high performance with 96.35% accuracy and 0.96 AUC.
• Demonstrates strong potential for clinical diagnosis and real-world medical applications.
Copyright © Amity Centre for Artificial Intelligence, All Rights Reserved.
Amity University Campus, Sector-125, Noida