Articles in this Volume

Research Article Open Access
A Review of the Applications and Frontier Progress of Deep Learning in Human Behavior Prediction: A Critical Examination
Accurate modeling of human behavior is essential for next-generation intelligent systems. Although deep learning handles high-dimensional data better than traditional Machine Learning (ML), several practical difficulties remain to be addressed: multimodal data fusion, loss of long-term temporal information, and lack of interpretability. This paper examines how deep learning is applied in three main areas: emotion recognition, social interaction patterns, and daily activity monitoring. The findings reveal a clear divide: architectures such as Long Short-Term Memory networks (LSTMs) and Convolutional Neural Networks (CNNs) handle simple tasks well, but capturing the inherent randomness of real social behavior remains difficult for them. In addition, Graph Neural Networks (GNNs) can model relationships but face computational scalability issues. Finally, this study recommends combining Explainable Artificial Intelligence (XAI) with privacy-preserving methods to address the "black box" problem and bridge the gap between practical performance and reliable real-world application.
Research Article Open Access
Review on Application of Image Super-Resolution Technology Based on Deep Learning
With the rapid development of information and intelligent technology, high-resolution images play an increasingly important role in medical imaging, remote sensing monitoring, security monitoring, intelligent transportation and industrial detection. Traditional image super-resolution (SR) methods have limitations in improving image detail and visual quality, while the introduction of deep learning has brought new breakthroughs in image super-resolution. In this paper, the development status and application progress of image super-resolution technology based on deep learning are systematically summarized. Starting from the basic principles of super-resolution, mathematical models, benchmark datasets and performance evaluation indices, the functions and advantages of convolutional neural networks (CNNs), upsampling methods, residual networks, sharpening layers, structural reparameterization and Patch technology in image reconstruction are analyzed. The typical applications of super-resolution technology in medical imaging, security monitoring, satellite remote sensing, high-definition video and image editing are further discussed, and its important value in detail recovery, visual enhancement and intelligent recognition is revealed. Finally, the main challenges facing current super-resolution technologies are summarized, including real degradation modeling, model complexity, perceptual quality balance and insufficient generalization ability, and future development trends are prospected, such as lightweight network design, multimodal fusion, Transformers and diffusion models. This paper aims to provide a systematic theoretical summary and application reference for deep learning-driven image super-resolution research, and to promote the implementation and innovation of this technology in more practical scenarios.
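The upsampling step the abstract mentions is often implemented as sub-pixel (pixel-shuffle) upsampling, which rearranges channels into spatial resolution. A minimal NumPy sketch of that channel-to-space operation, with illustrative shapes not taken from any specific paper:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Sub-pixel upsampling: rearrange (C*r^2, H, W) -> (C, H*r, W*r).

    This is the channel-to-space operation used by ESPCN-style
    super-resolution networks to upscale feature maps.
    """
    c_r2, h, w = x.shape
    assert c_r2 % (r * r) == 0, "channel count must be divisible by r^2"
    c = c_r2 // (r * r)
    # Split the channel axis into (c, r, r), then interleave into space.
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)        # (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)

# Upscale a 4-channel 8x8 feature map by 2x into a 1-channel 16x16 map.
feat = np.arange(4 * 8 * 8, dtype=np.float32).reshape(4, 8, 8)
sr = pixel_shuffle(feat, r=2)
print(sr.shape)  # (1, 16, 16)
```

In a full SR network a convolution first expands the channel count to `C*r^2`, and this rearrangement then produces the higher-resolution output in a single cheap step.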
Research Article Open Access
Research on Low-Power Optimization Methods and Software-Hardware Co-design Strategies for Embedded Systems
As the clock frequency and integration density of embedded systems continue to rise, especially with the widespread use of mobile embedded devices such as smartphones and Personal Digital Assistants (PDAs), power consumption has become a critical challenge in system design. Moreover, the slower advancement of battery technology compared to computational power makes system-level optimizations to reduce energy consumption essential for enhancing performance and increasing device longevity. Therefore, this paper explores methods for power consumption optimization in embedded systems, examining the potential for low-power solutions via the collaboration of hardware and software design. By reviewing recent research and investigating the findings of Tiwari et al., which involved simulations and typical embedded system setups, this study offers a comprehensive analysis of how to optimize power consumption from multiple angles. The results demonstrate that software-level optimizations, such as algorithm scheduling and instruction optimization, can greatly reduce system power usage. Furthermore, software-hardware co-design is identified as the key direction for the future development of low-power embedded systems, enabling high energy efficiency without compromising performance.
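The instruction-level power analysis associated with Tiwari et al. models program energy as the sum of per-instruction base costs plus circuit-state overheads between adjacent instructions. A toy sketch of that accounting, where every cost value is an illustrative placeholder rather than a measurement for any real CPU:

```python
# Toy instruction-level energy model in the spirit of Tiwari et al.:
# total energy = sum of per-instruction base costs plus circuit-state
# overheads for each adjacent instruction pair. All numbers below are
# illustrative placeholders, not measured values for any real processor.

BASE_COST = {"mov": 1.0, "add": 1.2, "mul": 3.5, "ld": 2.8, "st": 2.6}  # energy units
PAIR_OVERHEAD = {("ld", "mul"): 0.4, ("mul", "st"): 0.3}                # switching costs

def program_energy(trace):
    base = sum(BASE_COST[op] for op in trace)
    overhead = sum(PAIR_OVERHEAD.get((a, b), 0.1)   # default pair cost
                   for a, b in zip(trace, trace[1:]))
    return base + overhead

# Instruction scheduling: the same operations ordered with fewer costly
# adjacent pairs yield a lower estimated energy.
trace_a = ["ld", "mul", "st", "ld", "mul", "st"]
trace_b = ["ld", "ld", "mul", "mul", "st", "st"]
print(program_energy(trace_a), program_energy(trace_b))
```

The comparison between the two traces illustrates why software-level scheduling, one of the optimizations the paper highlights, can reduce energy without touching the hardware.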
Research Article Open Access
CNN–GRU Hybrid Network for Predicting sgRNA On-Target Activity in Bacteria
The precise prediction of CRISPR–Cas9 single-guide RNA (sgRNA) on-target activity is important for designing efficient gene-editing experiments. Although a large number of deep learning models have been proposed and validated, their performance decreases significantly when transferred to other species such as bacteria, owing to fundamental differences in genomic background and DNA repair/selection mechanisms. Based on multiple Cas9 variants, we propose CNNGRUHybrid, a hybrid neural architecture that (i) extracts local motif features using multi-scale 1D convolutions, (ii) models longer-range dependencies using stacked bidirectional GRUs (BiGRUs), and (iii) aggregates position-wise representations via a lightweight attention-style pooling mechanism. The model takes one-hot encoded 43-nt or 28-nt windows as input, and optionally incorporates numeric or tabular features such as GC content and assay-derived descriptors. We further define and compare pretraining–finetuning regimes across datasets/variants. Experiments comparing CNNGRUHybrid with representative CNN-based models show consistent improvements in rank-based metrics (e.g., Spearman correlation) on bacterial sgRNA activity prediction tasks.
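Two of the building blocks named in the abstract are easy to sketch in isolation: one-hot encoding of a sequence window and the attention-style pooling that collapses position-wise representations into a single vector. The NumPy sketch below uses illustrative shapes (a classic 23-nt spacer+PAM window and a 16-dimensional feature stand-in), not the paper's exact parameterization:

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA sequence as a (len, 4) one-hot matrix."""
    m = np.zeros((len(seq), 4), dtype=np.float32)
    for i, b in enumerate(seq):
        m[i, BASES.index(b)] = 1.0
    return m

def attention_pool(h, w):
    """Softmax-weighted sum over positions: h is (L, D), w is (D,)."""
    scores = h @ w                 # (L,) one scalar score per position
    a = np.exp(scores - scores.max())
    a /= a.sum()                   # attention weights, sum to 1
    return a @ h                   # (D,) pooled representation

# Illustrative 23-nt window; the paper's model uses 43-nt or 28-nt windows.
x = one_hot("ACGTACGTACGTACGTACGTACG")
print(x.shape)   # (23, 4)

rng = np.random.default_rng(0)
h = rng.normal(size=(23, 16)).astype(np.float32)   # stand-in for BiGRU outputs
pooled = attention_pool(h, rng.normal(size=16).astype(np.float32))
print(pooled.shape)  # (16,)
```

In the full architecture the one-hot matrix would pass through the multi-scale convolutions and BiGRUs before pooling; here random values stand in for those learned representations.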
Research Article Open Access
A Review of Research on Radiofrequency Technology for Delaying Skin Aging
Skin aging is an external manifestation of aging that occurs with increasing age, triggered by multiple mechanisms. Its main characteristics include skin laxity, increased wrinkles, and volume loss. Maintaining or improving youthful skin has become an increasingly important focus. With ongoing research, various anti-aging technologies have emerged. Radiofrequency (RF) technology, due to its ease of operation, high safety, and relatively long-lasting effects in facial rejuvenation treatments, has become an important cosmetic treatment method. Studies show that various RF technologies, ranging from monopolar to multipolar, can effectively reduce wrinkles, improve sagging skin, and enhance skin texture. However, there are limitations, including insufficient efficacy in some cases. This article aims to systematically review the latest developments in the application of RF technology for skin anti-aging, providing a reference for enhancing its clinical efficacy.
Research Article Open Access
Research on Automatic Modulation Recognition of Medium-SNR Signals Based on Deep Learning
With the rapid development of electronic technology and increasing requirements on signal utilization efficiency and low bit error rates, diverse signal modulation schemes have emerged to accommodate varied scenarios, including electronic countermeasures and military competition. The growing complexity of modulation schemes has made modulation recognition increasingly challenging, positioning deep learning-based automatic modulation recognition as a major research focus. This study uses the RadioML 2018.01A dataset from Kaggle to evaluate modulation recognition performance under signal-to-noise ratios ranging from 0 dB to 12 dB. A conventional convolutional neural network (CNN) and a residual neural network (ResNet), both commonly employed in image classification, are adopted and compared. The objective is to identify the modulation scheme used by the signals in the dataset and classify them into the corresponding categories. For certain modulation schemes, such as 64QAM and 128QAM, the residual neural network achieves classification accuracies that are up to approximately 20% higher than those of the convolutional neural network. Although the recognition accuracy of ResNet for some modulation schemes is slightly lower than that of the CNN, it remains within an acceptable range. Experimental results demonstrate that under medium signal-to-noise ratio conditions, the residual neural network outperforms the convolutional neural network.
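To make the task concrete, the sketch below generates the kind of input such classifiers see: unit-power square 64QAM symbols with additive white Gaussian noise at a chosen SNR, presented as a (2, N) I/Q array. RadioML 2018.01A ships pre-generated frames; this toy generator only mimics the setup and is not the dataset's exact pipeline:

```python
import numpy as np

def qam64_symbols(n, rng):
    """Random square 64QAM symbols, normalized to unit average power."""
    levels = np.array([-7, -5, -3, -1, 1, 3, 5, 7], dtype=np.float64)
    s = rng.choice(levels, size=n) + 1j * rng.choice(levels, size=n)
    return s / np.sqrt((levels**2).mean() * 2)   # E[|s|^2] = 1

def add_awgn(s, snr_db, rng):
    """Add complex Gaussian noise for a given SNR, assuming unit signal power."""
    noise_power = 10 ** (-snr_db / 10)
    n = rng.normal(scale=np.sqrt(noise_power / 2), size=(s.size, 2))
    return s + n[:, 0] + 1j * n[:, 1]

rng = np.random.default_rng(42)
clean = qam64_symbols(1024, rng)
noisy = add_awgn(clean, snr_db=12, rng=rng)      # upper end of the study's SNR range
# Networks consume the I and Q components as a (2, N) real array.
iq = np.stack([noisy.real, noisy.imag])
print(iq.shape)  # (2, 1024)
```

At 12 dB the 64 constellation points are still distinguishable but noticeably smeared, which is why dense constellations such as 64QAM and 128QAM are where the CNN/ResNet comparison is most informative.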
Research Article Open Access
From Deep Learning to Large Models: Detection Methods for AI-Generated Content and Deepfakes
The rapid evolution from traditional deep learning to large-scale foundation models has revolutionized content generation while introducing unprecedented challenges for detecting AI-generated content (AIGC) and deepfakes. This paper comprehensively analyzes the detection technologies across the paradigm shift from GANs to modern diffusion models and large language models (LLMs). We systematically categorize detection methods across generation eras, modalities (text, image, video, audio), and technical approaches. Our analysis reveals that traditional methods designed for GAN-generated content exhibit catastrophic failure on modern diffusion and LLM outputs. We examine state-of-the-art techniques such as watermarking schemes, zero-shot detection methods like DetectGPT, reconstruction-based approaches (DIRE), and multimodal verification. Furthermore, we analyze critical challenges related to cross-model generalization, adversarial robustness, and computational efficiency, providing practical guidance for method selection and deployment.
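The zero-shot DetectGPT method mentioned above rests on one observation: machine-generated text tends to sit near a local maximum of the scoring model's log-probability, so its log-prob drops under small perturbations more than human text's does. The sketch below shows only that scoring logic; `toy_log_prob` and `toy_perturb` are stand-ins, not a real LLM or the T5-based mask-filling used in the actual method:

```python
import random

def detectgpt_score(text, log_prob, perturb, n_perturb=20, rng=None):
    """Curvature-style gap: log p(text) minus mean log p over perturbations."""
    rng = rng or random.Random(0)
    lp = log_prob(text)
    perturbed = [log_prob(perturb(text, rng)) for _ in range(n_perturb)]
    return lp - sum(perturbed) / len(perturbed)

# Toy scoring model: highest log-prob on one canonical "machine" sentence,
# penalizing each word that deviates from it.
CANON = "the cat sat on the mat"
def toy_log_prob(t):
    return -0.5 * sum(a != b for a, b in zip(t.split(), CANON.split()))

def toy_perturb(t, rng):
    words = t.split()
    words[rng.randrange(len(words))] = "xyz"   # random word replacement
    return " ".join(words)

machine_like = CANON
human_like = "a cat was sitting on a mat"
# Machine-like text loses more log-prob under perturbation, so it scores higher.
print(detectgpt_score(machine_like, toy_log_prob, toy_perturb) >
      detectgpt_score(human_like, toy_log_prob, toy_perturb))
```

The real method replaces the toy pieces with an LLM's log-likelihood and semantically plausible mask-filled rewrites, but the decision rule (threshold on the gap) is the same.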
Research Article Open Access
A Review on the Application of Optical Character Recognition (OCR) in Robotics
Optical Character Recognition (OCR), once primarily associated with scanned document processing, has evolved into a pivotal perceptual competence for robots operating in human-centered environments. This review synthesizes the evolution, integration, applications, and lingering challenges of OCR technology in robotics. The paper traces the paradigm shift from classical handcrafted methodologies—including stroke-based and region-based techniques—to contemporary deep learning architectures, encompassing convolutional, transformer-based, and vision-language models. It further investigates the integration paradigms of OCR within robotic perception pipelines across heterogeneous platforms, including mobile robots, manipulators, autonomous vehicles, and aerial drones. Real-world deployment domains—such as logistics automation, service robotics, medical assistance systems, autonomous driving, and infrastructure inspection—are elaborated to elucidate the practical efficacy and deployment constraints of robotic OCR. Particular attention is paid to robotics-specific challenges, including motion blur, extreme viewpoints, environmental degradation, limited onboard computation, and safety-critical latency requirements. Despite substantial progress in recent years, robust and real-time scene text understanding in unconstrained real-world environments remains an open research frontier. Finally, the review identifies promising future research directions aimed at enabling more reliable, efficient, and context-aware reading capabilities for robots in real-world scenarios.
Research Article Open Access
AF-UNet: Asymmetric Fusion U-shaped Convolutional Networks for Retinal Vessel Segmentation
Accurate segmentation of retinal vessels is very important for computer-aided diagnosis and screening of diabetic retinopathy, hypertension, glaucoma and other eye diseases. The U-Net structure based on Convolutional Neural Networks (CNNs) excels at feature extraction and segmentation of retinal blood vessels. However, due to the complex structure of blood vessels in retinal images, low contrast and the interference of diseased areas, existing methods still struggle to accurately segment small and marginal blood vessels, so automatic segmentation with high precision and accuracy remains a major challenge. In this paper, we propose a new deep learning model called Asymmetric Fusion U-Net (AF-UNet), which improves on the U-Net framework and is specifically designed to segment retinal vessels. In this model, asymmetric skip connections are added between the encoder and decoder, integrating the shallow features of the encoder into the full-scale features of the decoder. The advantage of this design is that it enables multi-scale feature fusion, focuses on tiny vascular structures, and avoids overly dense connections, making the whole network lighter. Experiments on public retinal image datasets show that AF-UNet performs well on F1-score, recall and Intersection over Union (IoU), outperforming classical methods based on U-Net and attention mechanisms. In addition, the visualization results demonstrate that AF-UNet has clear advantages in maintaining vascular connectivity and accurately segmenting small vascular structures.
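The asymmetric skip idea described above can be reduced to its simplest form: a shallow encoder feature map is resized to a decoder stage's spatial resolution and concatenated with that stage's features, instead of the same-level skips of vanilla U-Net. The NumPy sketch below uses nearest-neighbour resizing and illustrative shapes, not the paper's exact design:

```python
import numpy as np

def resize_nearest(x, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) feature map."""
    c, h, w = x.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return x[:, rows][:, :, cols]

def asymmetric_fuse(shallow, decoder):
    """Fuse a shallow encoder map into a decoder stage: resize, then concat."""
    _, h, w = decoder.shape
    return np.concatenate([resize_nearest(shallow, h, w), decoder], axis=0)

shallow = np.ones((16, 128, 128), dtype=np.float32)   # first encoder stage
decoder = np.zeros((64, 32, 32), dtype=np.float32)    # a deeper decoder stage
fused = asymmetric_fuse(shallow, decoder)
print(fused.shape)  # (80, 32, 32)
```

Because the shallow map carries fine edge detail, routing it to deeper decoder stages is what lets the network keep thin-vessel information that symmetric skips at that depth would have lost.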
Research Article Open Access
The Application and Prospect of Multi-modal Information Fusion in Human Action Recognition
Unimodal recognition of behavioral postures is easily affected by environmental factors such as lighting and occlusion, so most existing action recognition research focuses on the multimodal field. This article therefore selects two representative current methods for comparison, analysis and summary: the dual-stream cross-modal fusion transformer and the model-based multimodal neural network. The core contribution of the dual-stream cross-modal fusion transformer is fusing RGB and depth information through modal enhancement and modal interaction. Model-based multimodal neural networks integrate OpenPose skeleton data with RGB frames and employ graph convolution and spatio-temporal ROI fusion. These two methods represent two distinct approaches to multimodal action recognition. In addition, this article outlines future research directions, such as lightweight model design, and provides a specific application analysis.
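The simplest instance of combining two modalities is score-level (late) fusion, where per-modality class scores are blended with modality weights. The two surveyed methods fuse features inside the network rather than at the score level, so the NumPy sketch below, with fixed illustrative weights, shows only the basic idea:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def late_fuse(rgb_logits, skel_logits, w_rgb=0.6, w_skel=0.4):
    """Blend per-modality class probabilities with fixed modality weights."""
    return w_rgb * softmax(rgb_logits) + w_skel * softmax(skel_logits)

rgb = np.array([2.0, 0.5, 0.1])    # RGB-stream scores, e.g. (wave, sit, walk)
skel = np.array([0.2, 0.1, 1.9])   # skeleton stream disagrees
probs = late_fuse(rgb, skel)
print(probs.argmax())  # 0: the more confident RGB prediction wins
```

Feature-level fusion (as in the transformer's modal interaction or the graph-convolutional ROI fusion) lets the modalities correct each other before a decision is made, which is exactly what this fixed-weight blend cannot do.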