Articles in this Volume

Research Article Open Access
Applications and Challenges of Tactile Sensing Technology in Robotic Grasping
The ability to grasp is what turns a robot from a passive observer into something that can act on the world—and how intelligently it grasps hinges, above all, on how well it perceives. For a long time, robotic grasping leaned almost entirely on vision. That works until it doesn't: try looking through something transparent, catching reflections off a polished surface, or making sense of an object that barely has any visible texture, and the limits of a camera become painfully clear. These persistent blind spots have pushed researchers to go beyond vision alone. This paper traces the emerging landscape of tactile-based grasping. It looks at how tactile sensors are being built and refined today, where they are already proving useful, and what breakthroughs—and roadblocks—have surfaced in real applications. With that foundation in place, the discussion then turns to the deeper challenges the field still faces. Finally, this paper peers a little further ahead, toward a new generation of tactile perception systems shaped by hardware–software co-design and by the broader push toward embodied intelligence. The aim is to map out the terrain clearly enough that others working on multimodal robotic grasping can find both orientation and inspiration.
Research Article Open Access
Macro Semantics, Micro Kinematics: Bridging VLMs and Dense Control in Embodied RL
Recent advances in Embodied AI have highlighted the potential of Vision-Language Models (VLMs) for continuous control. However, the investigations in this work reveal that applying lightweight VLMs to high-frequency Reinforcement Learning (RL) tasks can cause a "Zero Convergence Collapse". This failure originates from severe limitations in spatial resolving power, which produce flat plateaus in potential-based reward shaping (PBRS). To solve this, the paper proposes a Hierarchical Hybrid Reinforcement Learning Architecture that decouples cognitive planning from reflexive motor control: the VLM is restricted to identifying macro semantic milestones, while micro dense adjustments are delegated to classical kinematic formulas. Results in sparse-reward environments show that this hybrid approach reaches a 1.4x exploration speedup and a 100 percent convergence rate, eliminating the high variance of blind exploration. Meanwhile, the trained agent succeeds in internalizing macro-semantic knowledge, so it achieves high-speed, zero-shot inference during deployment without depending on the VLM. This progress provides a practical blueprint for integrating foundation models into embodied AI while maintaining reasonable mathematical stability.
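The flat PBRS plateau named in this abstract can be illustrated with a small sketch. Potential-based reward shaping adds the term F(s, s') = γΦ(s') − Φ(s) to the environment reward; when the potential Φ cannot resolve nearby states, that term is the same whether the agent moves toward the goal or away from it. Both potentials below, the goal position, the bucket size, and the discount factor are hypothetical choices for illustration and are not taken from the paper.

```python
import math

GAMMA = 0.99  # discount factor (assumed value, not from the paper)
GOAL = (1.0, 1.0)  # hypothetical goal position


def shaping(phi, s, s_next, gamma=GAMMA):
    """Potential-based shaping term F(s, s') = gamma * phi(s') - phi(s)."""
    return gamma * phi(s_next) - phi(s)


def coarse_potential(pos, bucket=0.5):
    """Hypothetical low-resolution potential: distance snapped to coarse buckets.

    Stands in for a scorer that cannot tell nearby positions apart.
    """
    distance = math.dist(pos, GOAL)
    return -math.floor(distance / bucket) * bucket


def dense_potential(pos):
    """Hypothetical dense potential: negative Euclidean distance to the goal."""
    return -math.dist(pos, GOAL)


# One small step toward the goal and one small step away from it.
start = (0.10, 0.10)
toward = (0.12, 0.12)
away = (0.08, 0.08)

coarse_toward = shaping(coarse_potential, start, toward)
coarse_away = shaping(coarse_potential, start, away)
dense_toward = shaping(dense_potential, start, toward)
dense_away = shaping(dense_potential, start, away)

print("coarse:", coarse_toward, coarse_away)  # identical values: a plateau
print("dense: ", dense_toward, dense_away)    # toward scores higher than away
```

Under these assumptions the coarse potential gives the agent no directional signal for small motions, which is the kind of gap a dense kinematic term would fill.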
Research Article Open Access
Research on Lightweight CNN Model for MRI Image Analysis Based on Edge Impulse Platform
Brain tumors are an extremely dangerous disease that can severely damage mental health. MRI is one of the most important methods for diagnosing them. To diagnose brain tumors accurately, doctors must concentrate on MRI images for long periods, which may cause visual fatigue and lead to misdiagnosis. As computer vision, machine learning, and deep learning have matured, they have gradually been introduced into medicine, where they alleviate visual fatigue for doctors and reduce the chance of misdiagnosis. However, deficiencies remain in lightweight model deployment and resolution adaptability. This research is based on the Edge Impulse platform and uses MobileNetV2 lightweight convolutional neural networks to construct MRI image classification models. During dataset preparation, the images were adjusted and intensity normalization was performed to eliminate the impact of differing MRI image parameters. To examine the impact of model capacity and input resolution, five combinations of input size and width multiplier (96×96 with 0.05, 96×96 with 0.1, 96×96 with 0.35, 160×160 with 0.35, and 160×160 with 0.5) were compared. Training accuracy, training loss, peak RAM, and test accuracy served as the key evaluation indicators. The results show that the model with an input size of 160×160 and a width multiplier of 0.5 yields the best classification performance and achieves an ideal balance between model size and accuracy.
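To make the width multiplier in this comparison concrete, the following sketch shows how such a multiplier scales the per-stage channel counts of MobileNetV2. It assumes the rounding convention of the reference MobileNetV2 code, where channel counts are rounded to a multiple of 8 and never reduced by more than 10%; whether the Edge Impulse models use exactly this rule is not stated in the abstract. The helper names are illustrative.

```python
# Stage output widths of MobileNetV2 at width multiplier 1.0
# (first convolution followed by the seven bottleneck stages).
BASE_CHANNELS = [32, 16, 24, 32, 64, 96, 160, 320]


def make_divisible(value, divisor=8, min_value=None):
    """Round a channel count to the nearest multiple of `divisor`.

    Assumed convention: never return less than `min_value`, and never
    round down by more than 10% of the requested value.
    """
    if min_value is None:
        min_value = divisor
    rounded = max(min_value, int(value + divisor / 2) // divisor * divisor)
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


def scaled_channels(alpha):
    """Channel count per stage for a given width multiplier `alpha`."""
    return [make_divisible(c * alpha) for c in BASE_CHANNELS]


# The width multipliers compared in the abstract.
for alpha in (0.05, 0.1, 0.35, 0.5):
    print(alpha, scaled_channels(alpha))
```

Under this convention the smallest multipliers collapse most stages to the minimum width of 8 channels, which suggests why model capacity, and not only input resolution, is treated as a variable in the comparison.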
Research Article Open Access
A CNN-Based Urban Sound Classification Method Using Mel Spectrograms for Noise Pollution Monitoring
Noise pollution is an easily overlooked form of urban pollution that affects people's physical and mental health, social relationships, and living environment. Alleviating it can also improve the ecological environment of cities. To support that goal, this study designed a model that classifies four kinds of sound: bird calls, children playing, air-conditioner outdoor unit noise, and mechanical engine noise. The study uses Mel spectrograms, which represent a sound signal as an image on a frequency scale that approximates human hearing. A convolutional neural network (CNN) then processes these images, with different layers identifying different image features. The experiment also used the Edge Impulse platform, which streamlines TinyML development by removing the need to build a pipeline or port models from scratch. The fully automated platform makes research more convenient and efficient. The model classifies the four sounds with an accuracy of 88.4% and a loss of 0.37.
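As a rough illustration of the Mel scale behind these spectrograms, the sketch below converts between hertz and mels with the widely used formula m = 2595 · log10(1 + f / 700) and places filter band edges evenly on the mel axis. The band count and the 8 kHz upper limit are assumed values chosen for the example; the abstract does not report the study's actual settings.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (common 2595 * log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min, f_max):
    """Return n_bands + 2 edge frequencies (Hz), evenly spaced in mels."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + i * step) for i in range(n_bands + 2)]


# Hypothetical settings: 4 bands covering 0 Hz to 8 kHz.
edges = mel_band_edges(4, 0.0, 8000.0)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
print([round(e, 1) for e in edges])
print([round(w, 1) for w in widths])  # bands widen as frequency rises
```

Because the edges are evenly spaced in mels, the bands are narrow at low frequencies and wide at high ones, which mirrors the ear's finer resolution at low pitch.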
Research Article Open Access
Experimental Study on the Effects of Different Noise Types on Visual Gesture Recognition Performance in Complex Backgrounds
Gesture recognition has gradually become a mainstream direction in natural human-computer interaction and is widely used in smart home devices, virtual reality, assistive systems, intelligent mobile devices, and similar applications. However, recognition in real-world visual environments is frequently affected by complex background distractions, blurred movements, or missing image content. This paper develops a test framework, based on the HaGRID dataset, for evaluating how different factors affect model robustness. Three gesture classes, namely fist, like, and palm, are selected for experimentation. Since the original images contain considerable background content and the hand region occupies only a limited area, an automatic hand-cropping procedure is introduced before classification. A transfer-learning model built on MobileNetV2 is then trained, and three noisy test sets are constructed from the same clean cropped test set, corresponding to static clutter, dynamic interference, and missing information. The results show that the proposed preprocessing step improves recognition performance in complex scenes. On the clean test set, the model reaches an accuracy of 79.07% with a weighted F1-score of 0.84. Performance decreases under all noisy conditions. Among them, static clutter produces the largest decline, missing information has a moderate effect, and dynamic interference leads to the smallest reduction. In the class-level analysis, fist shows the strongest robustness, while palm is the most easily affected by noise.
Research Article Open Access
A Study on a Multi-Classification Model for Chest X-Ray Images Based on Convolutional Neural Networks and Transfer Learning
At present, the diagnosis of pulmonary diseases, especially emphysema, pneumonia, and tuberculosis, depends mainly on physicians' judgment. Specifically, physicians distinguish between types of pulmonary disease by observing subtle differences in the features of X-ray images. This mainstream diagnostic method is time-consuming and laborious, and it may lead to delayed diagnosis and misdiagnosis. It is therefore essential to develop an efficient, automated, and intelligent method that helps physicians identify the disease category. This study uses CNNs and transfer learning on the Edge Impulse platform to train a multi-class classification model that meets these requirements. About 14,000 chest X-ray images were used as the training set. The model performed well on an independent test set and reached a level suitable for preliminary application, although it still had limitations in classifying certain categories. The study provides some guidance for the future development of this field.
Research Article Open Access
Analysis of Digital Education Ecosystem Based on Large Language Models
With the arrival of Industrial Revolution 4.0, the development of artificial intelligence has become unstoppable. In education, AI technology is changing the way teachers teach and students learn, and it is extending and improving the education ecosystem digitally. By 2025, the AI application rate among Chinese primary and secondary school teachers is expected to reach 81% (with a daily usage rate of 26.2%). This paper follows a general-specific-general structure. It mainly investigates the digital education ecosystem in terms of chatbots built on large language models, an analysis of ChatGPT, and reinforcement training drawing on expert systems and personalized learning systems. Through case studies, it analyzes the breakthroughs these systems have made, discusses further research, explores their future prospects and shortcomings, and proposes ideas for the innovative development of AI. Finally, it summarizes the digitalization of the education ecosystem under the influence of large language models.
Research Article Open Access
From CAP to CALM: Evolution of Consistency Model in Cloud Native Distributed System
Consistency trade-offs lie at the core of ensuring data correctness in distributed systems. To address the limitations of traditional models in cloud-native environments, this paper discusses the evolution and application of consistency theory. First, it reviews the paradigm shift from the consistency, availability, and partition tolerance (CAP) theorem to the consistency as logical monotonicity (CALM) theorem, and analyzes the key significance of "initiative to avoid coordination" in reducing system latency. Second, it introduces in detail the fine-grained classification framework of the "coordinated spectrum" for cloud-native environments, and quantitatively analyzes the multidimensional trade-off between consistency strength and system throughput in microservice and edge computing scenarios. Finally, it proposes a set of optimization paths based on monotonic design, which reduce synchronization overhead by identifying the logical monotonicity of application logic. The research shows that monotonic design is a breakthrough for improving the efficiency of large-scale distributed systems, and it provides theoretical support and practical guidance for state management under cloud-native architectures.
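The idea behind logical monotonicity can be sketched with a grow-only set, a standard textbook example and not necessarily one used in the paper. Its merge operation is set union, which is commutative, associative, and idempotent, so replicas reach the same state regardless of the order in which updates arrive and need no coordination. The class name and the sample updates are illustrative.

```python
from itertools import permutations


class GrowOnlySet:
    """Minimal grow-only set: state only ever grows, merge is set union."""

    def __init__(self, items=()):
        self._items = frozenset(items)

    def add(self, item):
        """Return a new state that also contains `item`."""
        return GrowOnlySet(self._items | {item})

    def merge(self, other):
        """Combine two replica states; order of merging does not matter."""
        return GrowOnlySet(self._items | other._items)

    def value(self):
        return self._items


# Three updates that may reach a replica in any order.
updates = [GrowOnlySet({"a"}), GrowOnlySet({"b"}), GrowOnlySet({"a", "c"})]

final_states = set()
for order in permutations(updates):
    state = GrowOnlySet()
    for update in order:
        state = state.merge(update)
    final_states.add(state.value())

# Every delivery order converges to the same final state.
print(final_states)
```

A non-monotonic question such as "is x absent from the set?" behaves differently: a later update can invalidate an earlier answer, which is where coordination becomes necessary.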
Research Article Open Access
Deep Learning for PLC-Based Industrial Robots
With the deepening of Industry 4.0 and the intelligent transformation of manufacturing, industrial robots are becoming the core equipment of automated production. Owing to their high reliability and strong anti-interference ability, PLCs have been deeply coupled and integrated with industrial robots, and control performance has improved greatly as a result. This paper studies industrial robotic systems that take PLCs as the main technology, covering the basic integrated technologies of industrial robots and PLCs and the coupling between the two. It also describes the main problems in the field: dependence on imported high-end core technology, integration barriers between industrial robots and PLCs, and high application barriers for the integrated technology. The study then offers suggestions for the future integration of industrial robots and PLCs. It aims to fill a gap in this area and provides basic references for the high-quality development of integrated PLC technologies, including their use in small and medium-sized enterprises.
Research Article Open Access
Comparative Study of Traffic Sign Detection Models Based on YOLOv8n
Traffic sign detection is now deeply integrated into autonomous driving. The task requires not only accuracy but also immediate reaction in order to keep driving safe. This paper selects three representative object detection models, Faster R-CNN, SSDlite, and YOLOv8n, for a comprehensive comparative study. The experiments are carried out on a dataset of traffic sign images in 15 categories. Faster R-CNN achieves the highest mAP@0.5, 92.71%, but has the lowest inference speed, only 9.51 FPS, which severely limits its practical real-time use. SSDlite runs faster, but its somewhat outdated architecture drops its mAP@0.5 to 66.78%. In contrast, YOLOv8n strikes a balance between precision and speed: it delivers a robust 89.46% mAP@0.5 and 76.64% mAP@0.5:0.95 while running at 72.52 FPS. In addition, analysis of the loss curves reveals that YOLOv8n converges more stably. Taking all these aspects into account, YOLOv8n appears to be the most appropriate of the three for edge deployment in autonomous driving.
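The thresholds in mAP@0.5 and mAP@0.5:0.95 refer to intersection over union (IoU) between a predicted box and a ground-truth box. The sketch below computes IoU for axis-aligned boxes and counts how many of the ten thresholds from 0.50 to 0.95 (step 0.05) a single prediction would satisfy. It covers only the box-matching step, not the full precision-recall averaging, and the sample boxes are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a true positive if IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


# The ten IoU thresholds averaged over in mAP@0.5:0.95.
THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Hypothetical ground-truth and predicted boxes.
ground_truth = (0.0, 0.0, 10.0, 10.0)
prediction = (2.0, 0.0, 12.0, 10.0)

overlap = iou(prediction, ground_truth)
passed = [t for t in THRESHOLDS if is_match(prediction, ground_truth, t)]
print(round(overlap, 3), passed)
```

A box that is only roughly aligned passes the loose 0.5 threshold but fails the stricter ones, which is why mAP@0.5:0.95 is typically lower than mAP@0.5 for the same model.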