Articles in this Volume

Research Article Open Access
Advancements in Fingerprint Recognition Through Deep Learning: A Comprehensive Analysis of Novel Algorithms
This paper presents a comprehensive exploration of the application of deep learning technology in fingerprint recognition, focusing on the development and impact of several novel algorithms. These algorithms have been specifically designed to address challenges in key sub-fields such as pose estimation, direction field estimation, minutiae extraction, and minutiae matching. By integrating deep learning techniques, these new approaches significantly enhance the accuracy, stability, and efficiency of fingerprint recognition systems. The study demonstrates that these algorithms surpass traditional methods in several critical areas, offering improved precision in recognizing fingerprints, particularly in high-noise environments. Furthermore, the fully differentiable nature of these models contributes to their robustness, enabling more consistent and reliable performance across diverse scenarios. The results underscore the potential for these deep learning-based algorithms to set new benchmarks in the field, with broad implications for their application in security, law enforcement, and other areas requiring reliable biometric authentication. As fingerprint recognition technology continues to develop, these methods are expected to play a central role in shaping its direction, supporting greater security and precision.
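The paper's deep models are not reproduced here, but the classical baseline they improve upon can be sketched. The following is a minimal, illustrative tolerance-based minutiae matcher (function name, tolerances, and the simple greedy strategy are assumptions for illustration, not the paper's method): each minutia is an (x, y, theta) tuple, and two minutiae match when they are close in position and orientation.

```python
import math

def match_minutiae(set_a, set_b, dist_tol=15.0, angle_tol=math.radians(20)):
    """Greedy one-to-one matching of minutiae given as (x, y, theta) tuples,
    with theta in radians. Returns a similarity score in [0, 1]."""
    used_b = set()
    matched = 0
    for xa, ya, ta in set_a:
        best_j, best_d = None, dist_tol
        for j, (xb, yb, tb) in enumerate(set_b):
            if j in used_b:
                continue
            d = math.hypot(xa - xb, ya - yb)
            # wrap the orientation difference into [0, pi]
            dt = abs((ta - tb + math.pi) % (2 * math.pi) - math.pi)
            if d <= best_d and dt <= angle_tol:
                best_j, best_d = j, d
        if best_j is not None:
            used_b.add(best_j)
            matched += 1
    if not set_a or not set_b:
        return 0.0
    return 2.0 * matched / (len(set_a) + len(set_b))
```

A learned, fully differentiable pipeline of the kind the paper describes replaces such hand-set tolerances with trained components, which is what yields the robustness gains in high-noise settings.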
Research Article Open Access
A Comparative Analysis of StackGAN and AttnGAN in Text-to-Image Generation
This research examines text-to-image generation, comparing two popular models—Stacked Generative Adversarial Networks (StackGAN) and Attentional Generative Adversarial Networks (AttnGAN)—and their respective strengths and weaknesses. Text-to-image generation has seen significant advancements with the introduction of GAN-based models, and this paper aims to explore how these models perform in terms of image quality, realism, and alignment with textual descriptions. Using the Caltech-UCSD Birds (CUB)-200-2011 dataset, which consists of bird images, extensive experiments were conducted to evaluate and compare the capabilities of the two models. The results indicate that AttnGAN outperforms StackGAN across multiple metrics, particularly in the accuracy of detail alignment and overall image realism. AttnGAN's multi-level attention mechanism allows it to attend to specific textual elements when generating related sections of the image, resulting in more aesthetically pleasing and semantically consistent outputs. Despite these advancements, challenges remain in improving both the diversity and quality of generated images. This work offers substantial insights into the capabilities and constraints of existing models, providing guidance for future research with the aim of improving text-to-image generation.
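The word-level attention that distinguishes AttnGAN can be illustrated with a small numpy sketch (a generic scaled dot-product attention, not AttnGAN's exact DAMSM formulation): each image-region query produces a weight distribution over caption words, and the weighted sum of word features becomes that region's text context.

```python
import numpy as np

def word_attention(region_queries, word_keys, word_values):
    """Scaled dot-product attention: each image region attends over caption words.

    region_queries: (R, d) array, one query per image region
    word_keys, word_values: (W, d) and (W, d_v) arrays, one row per word
    Returns (weights, context): (R, W) attention weights and (R, d_v) contexts.
    """
    d = word_keys.shape[-1]
    scores = region_queries @ word_keys.T / np.sqrt(d)   # (R, W)
    scores -= scores.max(axis=-1, keepdims=True)         # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)       # rows sum to 1
    context = weights @ word_values                      # (R, d_v)
    return weights, context
```

Because each region's weights form a distribution over words, fine-grained phrases ("red beak", "white belly") can dominate the regions they describe, which is the mechanism behind AttnGAN's better detail alignment.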
Research Article Open Access
Enhancing Text-to-Image Generation: Integrating CLIP and Diffusion Models for Improved Visual Accuracy and Semantic Consistency
Text-to-Image (T2I) generation focuses on producing images that precisely match given textual descriptions by combining techniques from computer vision and natural language processing (NLP). Existing studies have shown an innovative approach to enhance T2I generation by integrating Contrastive Language-Image Pretraining (CLIP) embeddings with a Diffusion Model (DM). The method involves initially extracting rich and meaningful text embeddings using CLIP, which are then transformed into corresponding images. These images are progressively refined through an iterative denoising process enabled by diffusion models. Comprehensive experiments conducted on the MS-COCO dataset validate the proposed method, demonstrating significant improvements in image fidelity and the alignment between text and images. When compared to traditional models like Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), which often struggle with maintaining both visual quality and semantic accuracy, this hybrid model shows superior performance. Future research could explore optimizing hybrid models further and applying T2I technology to specialized fields, such as medical imaging and scientific visualization, expanding its potential use cases.
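The CLIP side of such a hybrid can be sketched independently of the diffusion model: CLIP scores text-image pairs by cosine similarity between L2-normalized embeddings, scaled by a temperature and softmaxed. The sketch below uses toy embeddings (the temperature value and function name are illustrative assumptions; real CLIP embeddings come from its trained encoders).

```python
import numpy as np

def clip_similarity(text_emb, image_emb, temperature=0.07):
    """CLIP-style matching: cosine similarity of normalized embeddings,
    converted to a per-text probability distribution over images."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    logits = (t @ v.T) / temperature          # (num_texts, num_images)
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)   # rows sum to 1
```

In the hybrid pipeline the abstract describes, these text embeddings condition the diffusion model's iterative denoising steps, so the generated image is progressively pulled toward the region of embedding space that matches the caption.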
Research Article Open Access
Comparative Evaluation of Sentiment Analysis Methods: From Traditional Techniques to Advanced Deep Learning Models
Sentiment evaluation plays a crucial role in deciphering public perception and consumer responses in today's digital landscape. This investigation offers a thorough assessment of diverse sentiment evaluation techniques, contrasting conventional machine learning methodologies with cutting-edge deep learning frameworks. In particular, the research scrutinizes the efficacy of Bidirectional Encoder Representations from Transformers (BERT)-derived architectures (BERT-Base and Robustly Optimized BERT Pretraining Approach (RoBERTa)), Convolutional Neural Networks (CNN), Long Short-Term Memory Networks (LSTM), Support Vector Machines (SVM), and Naive Bayes classifiers. The study gauges these approaches based on their precision, recall, F1-metric, overall accuracy, and computational efficiency using an extensive sentiment evaluation dataset. The results reveal that BERT-based models, particularly RoBERTa, achieve the highest accuracy (87.44%) and F1-score (0.8746), though they also require the longest training time (approximately 3 hours). CNN and LSTM models strike a balance between performance and efficiency, while traditional methods like SVM and Naive Bayes offer faster training and deployment with moderate accuracy. The insights gained from this study are valuable for both researchers and practitioners, highlighting the trade-offs between model performance, computational demands, and practical deployment considerations in sentiment analysis applications.
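The "fast training, moderate accuracy" end of this trade-off can be made concrete with a self-contained multinomial Naive Bayes classifier (a textbook sketch with toy data, not the paper's implementation or dataset; the class name and smoothing choices are illustrative):

```python
import math
from collections import Counter

class NaiveBayesSentiment:
    """Multinomial Naive Bayes with Laplace smoothing over bag-of-words counts."""

    def fit(self, docs, labels):
        self.classes = set(labels)
        self.priors = {c: math.log(labels.count(c) / len(labels))
                       for c in self.classes}
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, lab in zip(docs, labels):
            self.word_counts[lab].update(doc.lower().split())
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict(self, doc):
        scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            s = self.priors[c]
            for w in doc.lower().split():
                # Laplace-smoothed log-likelihood; unseen words share one bucket
                s += math.log((self.word_counts[c][w] + 1)
                              / (total + len(self.vocab) + 1))
            scores[c] = s
        return max(scores, key=scores.get)
```

Training is a single counting pass, which is why such models deploy in seconds where a RoBERTa fine-tune takes hours; the cost is that word order and context, which the transformer models exploit, are discarded entirely.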
Research Article Open Access
Mitigating Bias in Large Language Models: A Multi-Task Training Approach Using BERT
Large language models (LLMs), such as ChatGPT, have become essential tools due to their advanced natural language processing capabilities. However, these models, trained on extensive internet text, can inadvertently learn and propagate unwanted biases, impacting their outputs. This study addresses this issue by analyzing and mitigating such biases through a multi-task and multi-stage training approach. Utilizing the Winograd Bias (Winobias) dataset, the research fine-tunes the Bidirectional Encoder Representations from Transformers (BERT) model to reduce biased outputs. The approach includes an initial mask task to establish a general understanding and a subsequent cloze task to specifically target and mitigate biases. Results demonstrate a significant reduction in bias, with the original model showing approximately 90% certainty in biased outputs, whereas the de-biased model reduced this certainty to 55%. This study effectively showcases a method for bias reduction by modifying only a few parameters, emphasizing a practical approach to enhancing fairness and balance in LLMs used across various applications.
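The "certainty" figures reported (roughly 90% before and 55% after de-biasing) can be read as the softmax probability of the preferred pronoun when the cloze prediction is restricted to the two candidate fillers. The toy function below illustrates that reading with made-up logit values; it is an interpretation for illustration, not the paper's exact evaluation code.

```python
import math

def pronoun_certainty(logit_he, logit_she):
    """Certainty = probability of the higher-scoring pronoun under a
    two-way softmax restricted to the candidate cloze fillers."""
    m = max(logit_he, logit_she)                # subtract max for stability
    p_he = math.exp(logit_he - m)
    p_she = math.exp(logit_she - m)
    return max(p_he, p_she) / (p_he + p_she)
```

A large logit gap between the pronouns yields certainty near 1.0 (a strongly biased completion), while near-equal logits yield certainty near 0.5, the behavior the multi-stage fine-tuning aims for on anti-stereotypical Winobias sentences.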
Research Article Open Access
Optimizing Top-K Query Processing on GPU Using Data Compression and Pre-Filtering
In large-scale data analysis, efficient Top-K query processing is critical for numerous applications in science, industry, and society. Traditional approaches often involve substantial data transfer and computational overhead, making it difficult to meet the scalability and efficiency demands of modern datasets. This paper proposes a GPU-accelerated Top-K query processing method that integrates data compression and pre-filtering techniques to address these challenges. By partitioning and compressing data on the host side, it alleviates common PCIe bottlenecks in heterogeneous computing environments. A metadata-driven pre-filtering technique further reduces the data volume processed on the GPU, significantly improving query performance, particularly when handling anti-correlated datasets. Experimental results demonstrate that this method markedly reduces data transfer and processing time, confirming its effectiveness in enhancing the efficiency and scalability of Top-K query processing compared to existing methods.
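The metadata-driven pre-filtering idea can be sketched on the CPU (the GPU execution, compression, and PCIe transfer of the actual system are not modeled; block layout and names are assumptions): if each partition records its maximum score, any block whose maximum cannot beat the current k-th best result can be skipped without ever being decompressed or transferred.

```python
import heapq

def topk_with_prefilter(blocks, k):
    """Top-K over partitioned data, skipping blocks via per-block max metadata.

    blocks: list of (block_max, values) pairs mimicking host-side partitions.
    Returns (top_k_values_descending, number_of_blocks_actually_scanned).
    """
    heap = []          # min-heap holding the current top-k candidates
    scanned = 0
    for block_max, values in sorted(blocks, key=lambda b: -b[0]):
        if len(heap) == k and block_max <= heap[0]:
            break      # no remaining block can improve the top-k
        scanned += 1
        for v in values:
            if len(heap) < k:
                heapq.heappush(heap, v)
            elif v > heap[0]:
                heapq.heapreplace(heap, v)
    return sorted(heap, reverse=True), scanned
```

Visiting blocks in descending order of their metadata maximum makes the early-exit check monotone: once one block fails it, all later blocks fail too, which is what lets the method prune most of the data volume before it reaches the GPU.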
Research Article Open Access
Exploring the Effects of Attention Mechanisms on Road Crack Detection and Classification
Pavement cracks are a prevalent form of roadway distress that pose significant safety hazards, necessitating prompt detection and repair. Due to the extensive road network, traditional image processing-based crack detection methods exhibit limitations in recognition accuracy and speed. To address these challenges, this paper proposes an efficient pavement crack detection model based on YOLOv8, termed YOLOv8-ACD (YOLOv8 - Attention for Cracks Detecting), which integrates a global attention mechanism. YOLOv8-ACD enhances detection efficiency and accuracy by focusing on crack identification while filtering out most irrelevant information. We evaluated YOLOv8-ACD on the RDD2022 dataset, and experimental results demonstrate significant improvements in key performance metrics, such as F1-Score and mean Average Precision (mAP), compared to the original YOLOv8 and other mainstream models. The real-time processing capability of this model makes it suitable for practical road inspection and maintenance, effectively reducing the workload of maintenance personnel and enhancing roadway safety.
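The mAP metric used to compare YOLOv8-ACD against the baseline is built on Intersection-over-Union between predicted and ground-truth boxes. As background (this is the standard definition, not code from the paper), IoU can be computed as:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection counts as a true positive only when its IoU with a ground-truth crack exceeds a threshold (commonly 0.5), so precision, recall, F1-Score, and mAP all inherit their meaning from this quantity.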
Research Article Open Access
Applications of Human-Computer Interaction in the Field of Cultural Heritage Preservation
The development of digital technologies has opened up new possibilities for practices across various fields, while the discipline of human-computer interaction has introduced new paradigms that enable humans to engage with digital technologies more conveniently. An increasing number of disciplines are seeking to intersect with the field of human-computer interaction, continuously advancing technology and exploring new research areas. This paper aims to assess the current applications of human-computer interaction technologies in the field of cultural heritage preservation and to provide guidance for potential future developments. To this end, this paper conducts a literature review to analyze and discuss the research progress in relevant academic journals. This paper specifically discusses three mainstream applications of human-computer interaction in the field of cultural heritage preservation: heritage information collection and preservation, restoration and display, and gamification. It analyzes the current research progress and key case studies for each application. The reviewed studies show that, compared with traditional research, the introduction of human-computer interaction technologies in cultural heritage preservation differs primarily in its emphasis on public participation. The project cases documented in the literature mainly focus on conceptual designs. The article concludes with a summary and an outlook on future developments.
Research Article Open Access
Exploring Feature Detection: A Comparative Study of Classical and Deep Learning Methods Across Complex Scenes
With the swift advancements in computing technology and artificial intelligence, the field of image processing has undergone profound changes. As a key link in information extraction, image features play a central role in computer vision tasks. Conventional methods like SIFT and SURF are often utilized in vision problems owing to their robustness and invariance. In contrast, deep learning algorithms, such as SuperPoint and D2-Net, show greater adaptability and robustness in complex environments. In this study, we comprehensively evaluate the performance of the classical algorithms SIFT and SURF, alongside the deep learning methods SuperPoint and D2-Net, across various scenarios, including repetitive patterns, cluttered backgrounds, and strong illumination conditions. The experimental results show that SIFT and SURF perform stably when dealing with simple environments, while SuperPoint and D2-Net demonstrate stronger adaptability and robustness in complex scenes, especially in terms of matching efficiency, average matching distance and consistency of feature distribution. Through comprehensive analysis and experimental verification, this paper reveals the effectiveness and limitations of the algorithms in different environments, providing a scientific basis for the selection of algorithms in practical computer vision tasks.
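Whichever detector produces the descriptors, matching them typically uses Lowe's ratio test: a nearest neighbour is accepted only if it is clearly closer than the second nearest, which suppresses ambiguous matches in repetitive patterns. A brute-force numpy sketch (function name and the 0.75 ratio are conventional choices, not necessarily the paper's settings):

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Brute-force descriptor matching with Lowe's ratio test.

    desc_a, desc_b: (N, d) and (M, d) descriptor arrays, M >= 2.
    Returns a list of (index_a, index_b) pairs that pass the test.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)   # distance to every candidate
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:      # unambiguous nearest neighbour
            matches.append((i, int(best)))
    return matches
```

Repetitive textures are exactly where this test rejects matches, since the best and second-best candidates have similar distances, which helps explain why matching quality, not just detection, separates the classical and learned methods in such scenes.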
Research Article Open Access
Enhancing Image Stitching Algorithms with SIFT Feature Detection
Image stitching technology plays a significant role in the fields of computer vision and image processing, with applications ranging from panoramic photography to virtual reality (VR), augmented reality (AR), medical diagnostics, and autonomous vehicle technology. As technology advances, the demand for high-quality, real-time panoramic images provided by image stitching technology continues to grow. This study aims to implement an image stitching method based on the Scale-Invariant Feature Transform (SIFT) feature point detection algorithm, combined with the Random Sampling Consensus (RANSAC) algorithm and the calculation of the homography matrix to automatically stitch two images. This paper elaborates on the entire process of image feature extraction, feature matching, homography matrix calculation, and image fusion, and compares different fusion modes. The experimental results show that while the method can achieve seamless image stitching in some cases, its performance in complex scenes such as crowds or traffic flows remains average. This study provides new perspectives and methods for the application of image stitching technology.
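The homography-calculation step of such a pipeline can be sketched in numpy via the Direct Linear Transform (a standard formulation; the RANSAC outlier-rejection wrapper and the SIFT feature stage of the full method are omitted, and the function names are illustrative):

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct Linear Transform: estimate the 3x3 homography H mapping
    src points onto dst points (each an (N, 2) array, N >= 4)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.asarray(rows, dtype=float)
    _, _, vt = np.linalg.svd(A)          # null vector of A = flattened H
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]                   # fix the scale ambiguity

def apply_homography(H, pts):
    """Apply H to (N, 2) points, returning the projected (N, 2) points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    proj = pts_h @ H.T
    return proj[:, :2] / proj[:, 2:3]    # divide out the homogeneous coordinate
```

In the full pipeline, the point pairs fed to this estimator come from SIFT feature matching, and RANSAC repeatedly runs it on random 4-point subsets to find the homography supported by the most inliers before the images are warped and fused.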