Volume 242 | Applied and Computational Engineering

Research Article Open Access

Published 25 May 2026 DOI: 10.54254/2755-2721/2026.GU33832

Ten Years of Deep Learning: The Evolution and Challenges from Convolutional Networks to Generative Models

Qiyuan Gui

In the last ten years, deep learning has transformed artificial intelligence, leading to significant advances in computer vision, natural language processing, and speech recognition. Deep learning has consistently pushed the boundaries of AI, as exemplified by the strong performance of Convolutional Neural Networks (CNNs) in image recognition, the effectiveness of Recurrent Neural Networks (RNNs) in processing sequential data, and the ability of generative models to produce images and text. However, alongside this rapid progress, deep learning faces numerous challenges, such as poor model interpretability, low data upload efficiency, weak generalization capabilities, and high computational resource consumption. This article reviews the development of deep learning over the past decade, focusing on the technological evolution of CNNs, RNNs, and generative models, and explores feasible solutions to these challenges. The research methodology combines literature review and case analysis: it systematically reviews key literature in the deep learning field and analyzes specific cases to better reveal the internal logic and future trends of deep learning technology. This article aims to help readers grasp the overall development of deep learning, identify research gaps in certain areas, and use this information to better predict future trends. The study is based on a comprehensive review of seminal papers and practical examples, utilizing tools such as bibliometric analysis and case study frameworks. The data are sourced from prominent academic databases and real-world applications. The findings highlight the significant advances and ongoing challenges in deep learning, providing insights into potential future directions and areas for further research.

Research Article Open Access

Published 25 May 2026 DOI: 10.54254/2755-2721/2026.GU33833

Review of FPGA Hardware Acceleration and Co-optimisation of Hardware and Software for Artificial Intelligence

Suyao Wang

The rapid development of deep learning poses an unprecedented challenge to computing power. Constrained by the memory wall and energy efficiency bottlenecks, the traditional von Neumann architecture struggles to meet the diverse deployment requirements of AI applications in cloud and edge scenarios. Field-Programmable Gate Arrays (FPGAs) are well suited to accelerating neural network inference because of their high energy efficiency, low latency, and hardware reconfigurability. This paper systematically reviews FPGA-based AI software-hardware co-optimisation technologies. First, it traces the evolution of the underlying architecture, from DSP-based multiply-add operations to on-chip memory hierarchies and Near-Data Processing (NDP). Second, it discusses the core strategies of software-hardware co-optimisation, including lightweight algorithms such as model quantization and structured pruning, as well as hardware design strategies such as pipelining and double-buffering mapping. Third, it analyses deployment in typical scenarios such as edge real-time detection and extreme environments. Finally, it discusses the memory bandwidth challenges and compilation toolchain barriers introduced by large language models. In summary, building on the reconfigurability that lets FPGAs adapt to algorithms, architectures, and scenarios, this paper describes an FPGA-based "software-defined hardware" paradigm of software-hardware collaboration and outlines its potential for achieving a system-level balance among energy efficiency, flexibility, and reliability in extreme scenarios.

Research Article Open Access

Published 25 May 2026 DOI: 10.54254/2755-2721/2026.GU33803

Architectural Constraints in Long-Sequence LLM Training: An Analytical Perspective on Hardware–Software Co-Design

Zhenhuan Shao

With Large Language Models (LLMs) moving toward ultra-long contexts, limitations in memory capacity, bandwidth, and inter-device communication increasingly restrict training efficiency. Long-context modeling exacerbates the already high resource demands of LLMs, as both the attention mechanism and the surrounding feed-forward and normalization layers scale with sequence length. This paper analyzes long-sequence LLM training from operator-level, system-level, and hardware-level perspectives. Specifically, it highlights that IO-aware exact-attention kernels can significantly reduce HBM traffic for attention computation, but they do not mitigate the activation-memory growth in non-attention modules. When sequence length grows, the main bottlenecks become communication, synchronization, and workload imbalance in distributed sequence/context parallelism. Importantly, the exact crossover point is model-, hardware-, and implementation-dependent. Future scaling will likely demand a combination of software optimization and co-design of memory systems, packaging, and interconnects.

Research Article Open Access

Published 25 May 2026 DOI: 10.54254/2755-2721/2026.GU33779

Research Progress on Mechanical Systems of Rehabilitation Robots from the Perspective of Mechanical Engineering

Fumeng Liao

The demand for rehabilitation exoskeletons has outpaced the mechanical engineering required to make them truly wearable. Demographic aging and rising stroke incidence have expanded the candidate patient pool, yet most devices still stall at the laboratory door because their physical hardware cannot reconcile the conflicting requirements of daily use: every gram saved from the limb tends to cost torque, every fixed-stiffness transmission eventually misaligns with anatomical joints, and every rigid actuator introduces a failure mode that exposes the patient to injury. These problems are not independent; they trace back to a single design tension—structure, drive, and sensing are still optimized separately rather than treated as one coupled interface. This review therefore narrows its scope to the mechanical layer, tracing recent advances in biomimetic structures, actuation architectures, and embedded sensing integration. Rather than covering control algorithms or clinical trials in isolation, this paper examines how structural choices constrain drive options and how drive options, in turn, dictate what sensing and feedback can realistically achieve. Particular attention is given to two under-addressed engineering gaps: long-term wearing comfort under cyclic loading, and the accumulation of nonlinear kinematic errors across multi-joint chains. The next generation of hardware will need to move beyond single-subsystem optimization toward adaptive stiffness, intrinsic mechanical safety, and high physical integration—shifts that are as much about design culture as about component performance.

Research Article Open Access

Published 25 May 2026 DOI: 10.54254/2755-2721/2026.GU33780

Robust Control Methodologies and Applications for Robotic Systems

Wei Zhou

When performing tasks in complex unstructured settings, robots often face model uncertainties, external perturbations, actuator nonlinearities, and failures. Robust control, regarded as a primary tool for ensuring system stability and tracking precision, has developed rapidly in recent years. This paper systematically reviews current developments in robust control for robotics, dividing the most advanced methods into five groups according to their theoretical background: H∞ control, sliding mode control, model predictive control, adaptive control, and observer-based control. The operating principles, improvement schemes, and common uses of each approach are explained in detail. The paper then summarizes robust control designs and engineering applications for collaborative robots, mobile robots, legged robots, space robots, and other robotic systems. It also identifies the main open issues in current research, such as disturbance modeling, computational performance, multiple constraints, and experimental verification. Finally, future perspectives point toward combining robust control with intelligent perception, digital twins, and reinforcement learning. It is hoped that this paper will be helpful to researchers who want to study and select high-precision, high-reliability, and high-stability control algorithms for current robotic systems.

Research Article Open Access

Published 1 June 2026 DOI: 10.54254/2755-2721/2026.GU33953

SLAM and Autonomous Navigation Optimization for Unstructured Environments: A Simulation Study Based on End-to-End Multimodal Fusion Algorithms

Zhonghao Yang

Unstructured environments are dynamic and unpredictable; therefore, autonomous navigation for mobile robots must be robust. To address the error coupling between the SLAM and navigation modules and the poor environmental adaptability of traditional hierarchical navigation systems, this paper proposes an integrated SLAM and autonomous navigation algorithm based on end-to-end multimodal fusion, and verifies its performance with publicly available simulation data and literature findings. First, the algorithm fuses heterogeneous data from LiDAR and visual sensors to extract environmental features and build a real-time map. Second, it constructs a lightweight multimodal large-scale model that converts environmental perception features, robot pose information, and natural language navigation commands into unified feature representations, and directly outputs motion control commands. Finally, it adds a geometric safety correction mechanism and an online replanning strategy to reduce collision risk and spatial-temporal alignment problems in the end-to-end algorithm. Comparative analysis against the traditional hierarchical algorithm (APF-RRT*) and existing end-to-end methods (VLA), using unstructured-scenario experimental data from the Gazebo simulation platform, shows that the proposed algorithm performs better: it achieves a 15.3% higher navigation success rate, a 28.7% reduction in path curvature variance, and a real-time processing latency of ≤50 ms. It thus effectively addresses the complex challenges of unstructured environments and provides algorithmic support for mobile robot applications in field operations and post-disaster rescue scenarios.

Research Article Open Access

Published 1 June 2026 DOI: 10.54254/2755-2721/2026.GU33954

3D Object Detection for Autonomous Driving via Multi-Sensor Fusion and Deep Guidance

Yanzhe He

3D object detection is a core technology of on-board perception systems for autonomous driving. Single-modal sensors are easily constrained by the environment and struggle to meet the application requirements of complex scenarios. This paper focuses on multi-sensor fusion and depth-guided 3D object detection methods for autonomous driving. It reviews the state of the art in data-layer, feature-layer, and decision-layer fusion, as well as depth-guided monocular detection and point cloud fusion, and compares the performance and applicable scenarios of different fusion strategies. The analysis shows that current technology faces challenges such as the difficulty of multi-sensor temporal and spatial synchronization, insufficient robustness of depth extraction, an imbalance between accuracy and real-time performance, the difficulty of small-sample training, and a lack of engineering standardization. Looking to the future, multi-modal adaptive fusion, lightweight depth-guided algorithms, and large-model empowerment will become the core development directions, providing theoretical references for the technical optimization and mass production of 3D object detection for autonomous driving.

Research Article Open Access

Published 1 June 2026 DOI: 10.54254/2755-2721/2026.GU33959

A Comparative Study of Efficient Neural Architectures for Facial Expression Recognition

Zitong Yang, Kexing Lian, Peichun Liao, Baoqi Gu

We study facial expression recognition (FER) under the twin constraints of low-resolution inputs and on-device inference. We benchmark compact convolutional baselines (ResNet-50/152) and an attention-enhanced ResNet-50 with squeeze-and-excitation modules and class re-weighting. We also adopt a lightweight transformer, Iwin-T, that interleaves windowed self-attention with depthwise separable convolution to balance global context modeling and local inductive bias under limited computational resources. Using class-weighted cross-entropy on FER2013 with an 8:1:1 train–validation–test split, Iwin-T attains 68.09% Top-1 accuracy, surpassing ResNet-50 (63.22%), ResNet-152 (66.60%), and the attention-augmented ResNet-50 (67.19%). Beyond raw accuracy, we analyze training dynamics including loss/accuracy oscillations and training–validation divergence, and identify scheduling choices that improve stability under tight memory and compute budgets. Our findings suggest a practical guideline for edge-oriented FER: (i) a well-tuned ResNet-50 remains a robust default choice for stable deployment; (ii) channel attention and class re-weighting provide simple yet effective accuracy gains; (iii) when optimization can be carefully stabilized, Iwin-T delivers the best accuracy–efficiency trade-off for low-resolution FER tasks.

Research Article Open Access

Published 1 June 2026 DOI: 10.54254/2755-2721/2026.GU33956

Artificial Intelligence Method Applications on Developing Role-Playing Games

Junxian Guan

Traditional development of Role-Playing Games (RPGs) suffers from numerous limitations, such as inflexible narrative design and a lack of immersive interaction. However, rapid progress in Artificial Intelligence (AI) methods such as Natural Language Processing (NLP) and Machine Learning (ML) provides new solutions to these issues in the RPG development workflow. This article surveys recent research on the connections between AI methods and RPGs, collects new approaches, and summarizes how AI technologies have been applied to RPGs so far. The review is organized into three parts: Procedural Content Generation (PCG), Dynamic Narrative, and Non-Player Character (NPC) interactions. Each part is further subdivided by usage scenario, covering improvements in storytelling, resource management, and judgement for game developers, as well as better immersion and gameplay optimization for players. The review shows the broad prospects for AI-assisted RPG development using the latest methodologies.

Research Article Open Access

Published 1 June 2026 DOI: 10.54254/2755-2721/2026.GU33955

Path Planning Methods for Agricultural Intelligent Robots

Hao Lu

With the rapid progress of smart agriculture, agricultural intelligent robots have been used in many stages of agricultural production, such as field operations, fruit and vegetable harvesting, and plant cultivation within facilities, making them important tools for improving the efficiency of agricultural production and advancing the modernization of agriculture. As the key technology supporting the autonomous operation of agricultural intelligent robots, path planning is crucial to their safety, efficiency, and coverage rate. This paper presents a comprehensive review of the mechanisms, technologies, and applications of path planning for agricultural intelligent robots, traces its development from single-robot autonomy to multi-robot collaboration, analyzes current challenges, and forecasts future directions. This study is expected to provide a useful reference for related research and engineering implementations of path planning for agricultural intelligent robots.

Articles in this Volume