Volume | Applied and Computational Engineering

Research Article Open Access

Published 12 January 2026 DOI: 10.54254/2755-2721/2026.TJ31146

A Review of Hybrid Models Combining Convolutional Neural Networks and Vision Transformers in Medical Image Processing

Chade Li

Medical image processing plays a very important role in modern healthcare diagnosis and treatment. However, traditional manual analysis faces challenges such as high variability, low efficiency, and low accuracy. Recently, deep learning techniques such as Convolutional Neural Networks (CNNs) have advanced rapidly and achieved remarkable success in medical image classification, segmentation, and detection tasks owing to their powerful feature extraction capabilities. Nevertheless, CNNs exhibit limitations in modeling global contextual information and rely heavily on large-scale annotated datasets. The emergence of Vision Transformers (ViTs) offers a new perspective by effectively modeling global image features through self-attention mechanisms. Hybrid models that combine the strengths of CNNs and Transformers have thus become a research hotspot. This paper reviews methods for fusing CNNs and Transformers in medical image processing, including typical strategies such as early fusion, intermediate fusion, and late fusion, and summarizes their application performance and advantages in various tasks. Experimental results show that hybrid models outperform single-architecture models in terms of accuracy, generalization ability, and adaptability to complex tasks. Finally, this paper discusses the challenges that hybrid models face in terms of data scarcity, computational efficiency, and interpretability, and outlines future research directions.

Research Article Open Access

Published 12 January 2026 DOI: 10.54254/2755-2721/2026.TJ31147

Investigating Foreground Background Separation in Vision Transformers for Image Classification

Bowen Jiang

Image classification is a key part of computer vision, with applications ranging from self-driving cars to medical imaging. Most traditional methods treat an image as a whole, which makes it hard for them to capture the different contributions of foreground objects and background context. Contour information, which is very important for finding salient structures, is often underused. Vision Transformers (ViTs) and other recent developments have greatly improved classification performance, but they still depend on a single representation of the image. In this study, we investigate the potential advantages of deliberately separating image components for classification. We propose a dual-stream ViT framework that processes foreground and background regions separately before combining their representations. The experimental results indicate that the proposed dual-stream model does not surpass ordinary single-stream ViTs but performs comparably across several benchmarks. Further analysis shows that the fundamental problem is the difficulty of separating the foreground from the background. In complicated or cluttered scenes, inaccurate region extraction adds noise that makes the dual-stream approach less useful. These results show that component-aware designs have considerable potential, but their success depends heavily on the quality of foreground–background segmentation, which remains a major open problem.

Research Article Open Access

Published 12 January 2026 DOI: 10.54254/2755-2721/2026.TJ31187

Deep Q-network in the Iterated Prisoner's Dilemma under Noise

Yixiao Chen

Maintaining cooperation among selfish players is a fundamental challenge in game theory, especially under realistic noise. This study applies a noisy Iterated Prisoner’s Dilemma (IPD) model to investigate how learning strategies perform against classical strategies when players may receive false or misleading signals due to random observation errors. More specifically, this study compares Deep Q-Network (DQN) agents with basic Q-learning (QL) and several classical strategies such as Tit-for-Tat, Win-Stay-Lose-Shift, and Grudger. The experimental results show that, when noise is present, DQN agents not only achieve higher cumulative rewards than other strategies but also show greater stability, adaptability, and resilience across repeated interactions. The deep neural structure of DQN agents helps them capture long-term temporal dependencies, effectively differentiate accidental defections from intentional ones, and recover cooperation after noise-induced disturbances. These findings indicate that deep reinforcement learning is effective in noisy and imperfect settings. The findings also offer valuable insights for understanding the emergence of cooperation and for designing robust multi-agent decision-making mechanisms in noisy or uncertain environments.
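
The noisy setting described in this abstract can be illustrated with a minimal sketch of an Iterated Prisoner's Dilemma in which each player sees the opponent's move flipped with some probability. This is not the article's implementation and it contains no DQN or Q-learning agent; it only shows the environment and the three classical strategies named above. The payoff values (T=5, R=3, P=1, S=0), the noise rate, and all function names are assumptions chosen for illustration.

```python
import random

# Assumed standard payoffs (T=5, R=3, P=1, S=0); the abstract does not state them.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def flip(move):
    return "D" if move == "C" else "C"

def tit_for_tat(my_hist, seen_hist):
    # Cooperate first, then copy the opponent's last observed move.
    return seen_hist[-1] if seen_hist else "C"

def grudger(my_hist, seen_hist):
    # Cooperate until a defection is observed, then defect forever.
    return "D" if "D" in seen_hist else "C"

def win_stay_lose_shift(my_hist, seen_hist):
    # Keep the last move if the opponent was seen cooperating, otherwise switch.
    if not my_hist:
        return "C"
    return my_hist[-1] if seen_hist[-1] == "C" else flip(my_hist[-1])

def play(strat_a, strat_b, rounds=200, noise=0.05, seed=0):
    rng = random.Random(seed)
    hist_a, hist_b = [], []        # true moves
    seen_by_a, seen_by_b = [], []  # possibly corrupted observations
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, seen_by_a)
        b = strat_b(hist_b, seen_by_b)
        pa, pb = PAYOFF[(a, b)]    # payoffs use the true moves
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
        # Observation noise: each player may see the opposite of the real move.
        seen_by_a.append(flip(b) if rng.random() < noise else b)
        seen_by_b.append(flip(a) if rng.random() < noise else a)
    return score_a, score_b

if __name__ == "__main__":
    for name, strat in [("TFT", tit_for_tat), ("WSLS", win_stay_lose_shift),
                        ("Grudger", grudger)]:
        print(name, "vs TFT:", play(strat, tit_for_tat, noise=0.05))
```

Payoffs are computed from the true moves while strategies only see the noisy observations, which is one plausible reading of "random observation errors"; a learning agent would replace one of the strategy functions.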

Research Article Open Access

Published 12 January 2026 DOI: 10.54254/2755-2721/2026.TJ31188

Enhancing Youth Engagement with Chinese Ancient Instruments Through Interactive Digital Tools and Pop Music Fusion

Yijun Chen

This research examines how an interactive digital tool might emotionally connect young people with their culture, concentrating on one of the oldest and richest traditions of China: ancient music. The application lets the user become a virtual instrumentalist, playing the guqin and bianzhong as well as some popular songs, while receiving step-by-step instructions, a scoring feature, a playback feature, and cultural insights presented as trivia tied to high scores. The purpose of the app is to use emotional and cultural elements to bring the past into young people's present and thereby increase their involvement. The study used systematic measures of cultural and musical engagement, together with emotional reactions to learning and music. Thirty-six young adults aged 18 to 25 completed a 14-day study that compared the influence of traditional cultural activities with this interactive gamified method. The data suggest that app users engage with the culture more deeply than the other participants, so interactive digital instruments can become a powerful tool for reviving cultural heritage while making it more fun and educational. The research outcome can inform the future of both cultural preservation and education.

Research Article Open Access

Published 20 January 2026 DOI: 10.54254/2755-2721/2026.TJ31353

An Improved Grey Wolf Optimizer Based on Elite Opposition-Based Learning Strategy and Non-Linear Parameters

Jiayi Miu

The Grey Wolf Optimizer (GWO) has been extensively applied in meta-heuristic optimization. However, it inherently suffers from several limitations, including inaccurate solution outputs, slow convergence speed, and a high tendency to get trapped in local optima. To resolve these problems, this study introduces two targeted improvements to the GWO algorithm. Firstly, an elite opposition-based learning method is employed for initializing the grey wolf population. This method enhances the diversity of initial individuals, reinforces the algorithm’s global search ability, and accelerates convergence in the early iteration stage. Secondly, nonlinear parameters are incorporated into both the prey encircling and attacking processes of GWO. This modification expands the algorithm’s search scope during the early iterations. Ten benchmark test functions with distinct characteristics were used to validate the improved algorithm (IEN-GWO), which was compared with five well-recognized meta-heuristic algorithms. The experimental results demonstrate that IEN-GWO outperforms the compared algorithms in terms of solution precision, stability, and convergence rate.
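
The two improvements described in this abstract can be sketched as follows. This is a simplified illustration, not the IEN-GWO algorithm itself: the abstract gives no formulas, so the opposition rule (opposite = k * (da + db) - x, with da and db the per-dimension bounds of the current population), the quadratic decay of the control parameter, and the names `eobl_init`, `nonlinear_a`, and `sphere` are all assumptions.

```python
import random

def sphere(x):
    # Simple benchmark function used only for illustration.
    return sum(v * v for v in x)

def eobl_init(fitness, n, dim, lo, hi, seed=0):
    """Initialize n wolves, add opposition-based candidates, keep the n fittest."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    # Dynamic bounds: per-dimension minimum and maximum of the random population.
    da = [min(w[j] for w in pop) for j in range(dim)]
    db = [max(w[j] for w in pop) for j in range(dim)]
    opposite = []
    for w in pop:
        k = rng.random()  # random scaling coefficient in [0, 1)
        cand = []
        for j in range(dim):
            v = k * (da[j] + db[j]) - w[j]
            if not (lo <= v <= hi):
                v = rng.uniform(lo, hi)  # re-sample values that leave the search range
            cand.append(v)
        opposite.append(cand)
    # Merge both sets and keep the n individuals with the lowest fitness.
    return sorted(pop + opposite, key=fitness)[:n]

def nonlinear_a(t, t_max):
    # Assumed quadratic decay from 2 to 0; standard GWO uses the linear 2 * (1 - t / t_max).
    return 2.0 * (1.0 - (t / t_max) ** 2)

if __name__ == "__main__":
    wolves = eobl_init(sphere, n=10, dim=5, lo=-100.0, hi=100.0)
    print("best initial fitness:", sphere(wolves[0]))
    print("a at mid-run:", nonlinear_a(50, 100), "vs linear:", 2.0 * (1 - 50 / 100))
```

With this assumed schedule the control parameter stays above the linear one throughout the run, which keeps the step sizes larger in early iterations, in line with the wider early search the abstract describes.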

Research Article Open Access

Published 20 January 2026 DOI: 10.54254/2755-2721/2026.TJ31299

A Survey of Deep Time Series Forecasting with Spectral Analysis

Hengxu Lai

Time series forecasting is a core technology in critical domains such as energy dispatch and traffic management, yet its performance is hindered by challenges including long-term dependencies, multi-scale structures, and data non-stationarity. In recent years, integrating spectral analysis with deep learning has emerged as a significant trend for improving forecasting accuracy and efficiency. This paper systematically reviews progress in this field by introducing a challenge-oriented classification framework encompassing four dimensions: long-term dependency modeling, multi-scale feature extraction, lightweight design, and multi-task general-purpose modeling. Within this framework, we conduct a comparative analysis of representative methods, including Autoformer, FEDformer, and TimesNet, among others. These methods enhance modeling capacity for complex temporal patterns through mechanisms such as spectral sparsification, adaptive frequency filtering, and temporal multi-periodic transformations. We evaluate methods on eight mainstream benchmark datasets across multiple forecast horizons (96–720 steps). Results demonstrate that spectral sparsification and memory filtering mitigate error accumulation in long-term forecasting; multi-scale decomposition structures balance short-term fluctuations and long-term trends; and lightweight linear models achieve superior parameter efficiency on high-dimensional stationary data. By synthesizing technical pathways and scenario-based comparisons, this study offers practical guidance for model selection in engineering applications. Finally, we outline future research directions, including time-varying period detection and joint time-frequency representation, to enhance the robustness of forecasting models in non-stationary environments and facilitate their real-world deployment.
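
As a small illustration of how spectral analysis exposes the multi-periodic structure mentioned in this abstract, the sketch below estimates the dominant periods of a series from its discrete Fourier transform amplitudes, an idea that period-based models such as TimesNet build on. It is not taken from any of the surveyed methods; the function names and the synthetic series are assumptions, and a plain O(n²) transform is used to avoid external dependencies.

```python
import cmath
import math

def dft_amplitudes(x):
    """Return (frequency index k, amplitude) for k = 1 .. n // 2."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]  # remove the constant component
    amps = []
    for k in range(1, n // 2 + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(centered))
        amps.append((k, abs(s) / n))
    return amps

def dominant_periods(x, top_k=2):
    """Periods (in time steps) of the top_k strongest frequency components."""
    n = len(x)
    ranked = sorted(dft_amplitudes(x), key=lambda ka: ka[1], reverse=True)
    return [n / k for k, _ in ranked[:top_k]]

if __name__ == "__main__":
    # Synthetic series with a strong 24-step cycle and a weaker 8-step cycle.
    series = [math.sin(2 * math.pi * t / 24) + 0.5 * math.sin(2 * math.pi * t / 8)
              for t in range(96)]
    print(dominant_periods(series))
```

In practice a fast Fourier transform would replace the explicit sum, and the detected periods would be used to reshape or decompose the series before forecasting.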

Research Article Open Access

Published 20 January 2026 DOI: 10.54254/2755-2721/2026.TJ31321

Application of Statistical Model Based on Artificial Intelligence in Image Recognition

Yize Jiang

The development of image recognition reflects the evolution of the technology paradigm. This paper systematically reviews the evolution of traditional statistical models and deep learning models, analyzes core fusion technologies such as Bayesian neural networks and deep integration, and discusses their application value in high-risk scenarios including medical image diagnosis, autonomous driving, and industrial defect detection. It also points out key challenges such as computational efficiency, unified evaluation standards, and model calibration. The study focuses on the overconfidence and lack of explainability of deterministic deep learning models, and finds that fusing AI with statistical models to realize "probability prediction" is an effective solution. Different fusion technologies have their own advantages and require scenario-based trade-offs. The research provides a theoretical reference for constructing trustworthy image recognition systems and points out future directions such as efficient algorithm development and cross-disciplinary research with interpretable AI.

Research Article Open Access

Published 20 January 2026 DOI: 10.54254/2755-2721/2026.TJ31360

Interaction-Enhanced and Explainable Machine Learning for Diabetes Risk Prediction

Zhiyuan Chen

Diabetes mellitus is a common chronic metabolic disease for which early diagnosis is crucial for prevention and treatment. As the amount of structured clinical information grows, machine learning has emerged as a valuable instrument for predicting diabetes risk. However, several studies apply machine learning models to diabetes risk prediction using only the original clinical features, and they generally lack a systematic inspection of feature combinations and model interpretability. In this research, various machine learning models were built and tested to predict diabetes risk from structured clinical data. Clinically motivated interaction terms were constructed to capture nonlinear physiological relationships, and a two-criterion selection approach using tree-based split gain and SHAP importance was used to identify meaningful interactions. An interaction-enhanced XGBoost model was then trained and compared with baseline and complete-interaction models using conventional classification metrics. The experimental results indicate that the noise introduced by indiscriminate inclusion of interaction features can degrade generalization performance, whereas selectively retained interactions can increase sensitivity without compromising discriminative performance. Glucose, BMI, and age were identified as dominant predictors, and their role in diabetes prediction was verified through feature ablation analysis. Furthermore, the SHAP interpretability analysis provided clear and clinically coherent explanations of model behavior. Overall, the developed framework suggests that a sensible trade-off between predictive efficiency and interpretability is achievable, which highlights the importance of focused feature interaction modeling for reliable and explainable diabetes risk assessment.

Research Article Open Access

Published 26 January 2026 DOI: 10.54254/2755-2721/2026.TJ31449

The Effectiveness of mHealth Intervention in Improving Sleep Health for Adolescents and Young Adults: A Scoping Review

Yiyun Jin

Insufficient sleep is one of the most significant health concerns around the world because it can increase the risk of several chronic diseases and conditions, including diabetes, behavioral problems, and obesity. In recent years, mobile health (mHealth) technologies have been widely adopted for health monitoring and improvement. This scoping review investigates the effectiveness of mHealth interventions in improving sleep health for adolescents and young adults. The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) standard. Studies were searched in the PubMed, IEEE Xplore, ACM Digital Library, and Cochrane databases. The review included studies that employed at least one type of mHealth intervention for sleep, reported at least one evaluation parameter for sleep health, and conducted experiments with participants aged 10-25 years. A total of 499 studies were identified, and 18 met the inclusion criteria. The majority of studies used subjective self-report questionnaires or diaries (N=14 of 18), whereas only four studies measured sleep health via objective wearable sensors. More than half of the included studies (N=10 of 18) reported that their mHealth interventions had a significant impact on at least one sleep health parameter. CBT-I app-based mHealth interventions and digital reminder approaches showed the most consistent improvement, reporting significant effects on sleep health outcomes such as sleep quality, efficiency, and insomnia symptoms. This scoping review provides an overview of the current state and limitations of mHealth interventions for improving sleep in adolescents and young adults. One of the limitations identified in the review demonstrates the need for standardized sleep outcome measurement in order to increase comparability among different mHealth interventions and studies.

Research Article Open Access

Published 26 January 2026 DOI: 10.54254/2755-2721/2026.TJ31432

Construction of a Machine Learning Pipeline Based on NMR Data: Analysis and Determination of the Primary Structure of Single-Stranded RNA

Shi Chen

This study aims to establish a machine learning pipeline for determining and analyzing the primary structure of single-stranded RNA from NMR data. The pipeline consists of two stages: stage 1 uses a binary RNA classification model, and stage 2 uses a four-class A/U/G/C recognition and sorting model. During the experiment, single-stranded RNA NMR data collected from the BMRB and NP-MRD data sources were processed using class imbalance calculation and SMOTE oversampling. Models such as random forest and gradient boosting were assessed with 5-fold cross-validation, Wilson score confidence intervals, and a generalization ability evaluation to determine how well they generalize and whether overfitting occurred. Results show that the best model for stage 1 is the Random Forest model (with 30 features), with an accuracy of 90.48%; the best model for stage 2 is the Gradient Boosting model (100 trees, depth 5, learning rate 0.1), with an accuracy of 96.30% on the independent test set. In the feature engineering of the stage 2 model, four H6-H5 difference features were added, which reduced the confusion between C and U and improved the accuracy of the model. This machine learning pipeline can predict RNA sequences of 8-20 nucleotides based on NMR data.
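
The Wilson score confidence interval mentioned in this abstract can be computed directly from the number of correct predictions and the test set size. The sketch below shows the standard formula; the counts in the usage example (26 correct out of 27) are hypothetical, chosen only because 26/27 rounds to 96.30%, since the abstract does not state the actual test set size. The function name is likewise an assumption.

```python
import math
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion such as test accuracy."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half

if __name__ == "__main__":
    # Hypothetical counts: 26 correct predictions on 27 test samples.
    low, high = wilson_interval(26, 27)
    print(f"accuracy = {26 / 27:.4f}, 95% interval = ({low:.4f}, {high:.4f})")
```

Unlike the simple normal approximation, the Wilson interval stays inside [0, 1] and remains informative for small test sets and accuracies close to 100%, which is why it suits small independent test sets.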

Articles in this Volume