Volume 52 | Applied and Computational Engineering

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241111

Investigation of generative capacity related to DCGANs across varied discriminator architectures and parameter counts: A comparative study

Fan Yi

Generating lifelike images with generative models is a significant challenge, and Generative Adversarial Networks (GANs), particularly Deep Convolutional GANs (DCGANs), are commonly employed for image synthesis. This study alters the structure and parameter count of the DCGAN discriminator and investigates the effects on the generated images. The models are assessed using the Fréchet Inception Distance (FID) score, a metric that gauges the quality of generated image samples. Specifically, some convolutional layers are replaced with fully-connected layers, and the outcomes are compared to discern the impact of these structural changes. Furthermore, dropout is used to study the influence of the number of parameters: the FID scores of the models are compared at dropout probabilities of 0, 0.2, 0.4, 0.6 and 0.8. Experimental results show that the DCGAN with fully-connected layers has stronger generative ability than the original one, and that the generated images are most realistic when the dropout probability is 0.6. Finally, the paper explains possible reasons for the difference and proposes a better generative model based on DCGAN.
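
As a rough illustration of the metric named in this abstract, the sketch below computes a Fréchet distance between two sets of feature vectors. It is a simplified stand-in, not the paper's evaluation code: the real FID uses Inception-network features and full covariance matrices, whereas this version assumes diagonal covariances so that it runs on the standard library alone. The function name and the toy feature values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def frechet_distance_diagonal(real_feats, fake_feats):
    """Frechet distance between two Gaussians fitted to feature vectors.

    Assumption: covariances are diagonal, so the trace term
    Tr(S1 + S2 - 2 (S1 S2)^(1/2)) reduces to a sum of (sd1 - sd2)^2
    over feature dimensions. The full FID does not make this assumption.
    """
    dims = len(real_feats[0])
    total = 0.0
    for d in range(dims):
        real_col = [row[d] for row in real_feats]
        fake_col = [row[d] for row in fake_feats]
        mu_r, mu_f = fmean(real_col), fmean(fake_col)
        sd_r = math.sqrt(pvariance(real_col))
        sd_f = math.sqrt(pvariance(fake_col))
        # Squared mean gap plus squared standard-deviation gap.
        total += (mu_r - mu_f) ** 2 + (sd_r - sd_f) ** 2
    return total


# Toy 2-D "features"; a lower score means the two sets are closer.
real = [[0.0, 1.0], [2.0, 3.0]]
fake = [[1.0, 1.0], [3.0, 3.0]]
score = frechet_distance_diagonal(real, fake)
```

Comparing discriminator variants, as the study does, would amount to computing such a score for each model's generated samples against the same real set and preferring the lowest value.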

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241115

Data augmentation-based enhanced fingerprint recognition using deep convolutional generative adversarial network and diffusion models

Yukai Liu

The progress of fingerprint recognition applications encounters substantial hurdles due to privacy and security concerns, leading to limited fingerprint data availability and stringent data quality requirements. This article endeavors to tackle the challenges of data scarcity and data quality in fingerprint recognition by implementing data augmentation techniques. Specifically, this research employed two state-of-the-art generative models in the domain of deep learning, namely Deep Convolutional Generative Adversarial Network (DCGAN) and the Diffusion model, for fingerprint data augmentation. Generative Adversarial Network (GAN), as a popular generative model, effectively captures the features of sample images and learns the diversity of the sample images, thereby generating realistic and diverse images. DCGAN, as a variant model of traditional GAN, inherits the advantages of GAN while alleviating issues such as blurry images and mode collapse, resulting in improved performance. On the other hand, Diffusion, as one of the most popular generative models in recent years, exhibits outstanding image generation capabilities and surpasses traditional GAN in some image generation tasks. The experimental results demonstrate that both DCGAN and Diffusion can generate clear, high-quality fingerprint images, fulfilling the requirements of fingerprint data augmentation. Furthermore, through the comparison between DCGAN and Diffusion, it is concluded that the quality of fingerprint images generated by DCGAN is superior to the results of Diffusion, and DCGAN exhibits higher efficiency in both training and generating images compared to Diffusion.

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241136

Exploring the potential of federated learning for diffusion model: Training and fine-tuning

Shuo Chen

Diffusion models, a class of state-of-the-art generative models, have drawn attention for their capacity to produce high-quality, diverse, and flexible content. However, training these models typically requires large datasets, which can be hindered by privacy concerns and data distribution constraints. Because of the amount of data and hardware required for large-model training, centralized training is largely limited to large companies or labs with sufficient computing power. Federated Learning provides a decentralized method that allows model training across several data sources while keeping the data local, reducing privacy threats. This research proposes and evaluates a novel approach for utilizing Federated Learning in the context of diffusion models. It investigates the feasibility of training and fine-tuning diffusion models in a federated setting, considering various data distributions and privacy constraints. The study uses the Federated Averaging (FedAvg) technique to train an unconditional diffusion model as well as to fine-tune a pre-trained diffusion model. The experimental results demonstrate that federated training of diffusion models can achieve performance comparable to centralized training while preserving data locality. Additionally, Federated Learning can be effectively applied to fine-tune pre-trained diffusion models, enabling adaptation to specific tasks without exposing sensitive data. Overall, this work demonstrates Federated Learning's potential as a useful tool for training and fine-tuning diffusion models in a privacy-preserving manner.
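
The aggregation step of Federated Averaging mentioned in this abstract can be sketched as follows. This is a minimal illustration of the general FedAvg idea (a server averages client parameters, weighted by local dataset size), not the paper's implementation. The flat parameter lists, the function name, and the client sizes are assumptions made for the example.

```python
def fed_avg(client_weights, client_sizes):
    """Aggregate client parameter vectors into one global vector.

    Each client's parameters are weighted by its number of local samples.
    Parameters are represented as flat lists of floats (an assumption;
    a real model would hold tensors per layer).
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients with 1 and 3 local samples respectively.
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full training loop, the server would send `global_weights` back to the clients, each client would train locally on its own data, and the aggregation would repeat each round, so raw data never leaves a client.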

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241138

The application of federated learning in face recognition: A systematic investigation of the existing frameworks

Chuanzhi Xu

This paper presents a thorough examination of the recent progress made in applying federated learning to the field of face recognition. As face recognition technology continues to gain widespread adoption across various sectors, issues related to data privacy and efficiency have taken center stage. In response, federated learning, characterized by its decentralized machine learning approach, has emerged as a promising solution to tackle these pressing concerns. This review categorises the current federated learning frameworks for face recognition into four main purposes: Training Efficiency, Recognition Accuracy, Data Privacy, and Spoof Attack Detection. Each category is explored in-depth, highlighting the principles, structures, applicability, and advantages of the frameworks. The paper also delves into the challenges faced in the integration of federated learning and face recognition, such as high computational overhead, model inconsistency, and data heterogeneity. The review concludes with recommendations for future research directions, emphasising the need for model compression, asynchronous communication strategies, and techniques to address data heterogeneity. The findings underscore the potential and challenges of applying federated learning in face recognition, paving the way for more secure and efficient facial recognition systems.

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241204

Exploring the potential of data augmentation in poetry generation with small-scale corpora

Renxiang Huang

Poetry generation is a complex task in the field of natural language processing, especially when working with small datasets. Data augmentation techniques have been shown to be an effective way to improve the performance of deep learning models in various tasks, including image classification and speech recognition. Therefore, this study focuses on exploring the impact of four different data augmentation methods - Synonym Replacement, Random Insertion, Random Swap, and Random Deletion - on the performance of poetry generation with a small poetry dataset. The results of the study reveal that Random Insertion performed well in terms of Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and manual evaluation when compared to other data augmentation techniques. Synonym Replacement performed poorly in all three evaluations. This study confirms the potential value of data augmentation technology in poetry generation tasks and provides innovative perspectives and directions for future research in this area. Data augmentation can be employed to help address the problem of limited data in poetry generation tasks and enhance the efficiency of deep learning models. Future research could focus on exploring more advanced data augmentation techniques and their impact on poetry generation tasks.
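
The four augmentation operations named in this abstract can be sketched on a tokenized line of verse as below. This is an illustrative sketch under stated assumptions, not the study's code: the tiny `SYNONYMS` table is a hypothetical stand-in for a real thesaurus, and the function names and parameters are chosen for the example.

```python
import random

# Hypothetical toy thesaurus; a real system would use a lexical resource.
SYNONYMS = {"moon": ["luna"], "bright": ["radiant", "shining"], "river": ["stream"]}


def synonym_replacement(words, n, rng):
    """Replace up to n words that have an entry in SYNONYMS."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(words, n, rng):
    """Insert a synonym of a randomly chosen word at a random position, n times."""
    out = list(words)
    with_syn = [w for w in words if w in SYNONYMS]
    if not with_syn:
        return out  # nothing to draw synonyms from
    for _ in range(n):
        new_word = rng.choice(SYNONYMS[rng.choice(with_syn)])
        out.insert(rng.randint(0, len(out)), new_word)
    return out


def random_swap(words, n, rng):
    """Swap two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


rng = random.Random(0)  # seeded for reproducibility
line = "the bright moon over the river".split()
augmented = [
    synonym_replacement(line, 1, rng),
    random_insertion(line, 1, rng),
    random_swap(line, 1, rng),
    random_deletion(line, 0.2, rng),
]
```

Each augmented variant would be added to the small training corpus alongside the original line; the study then compares how each operation affects generation quality.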

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241228

Designing a bias-rating news recommendation system

Siqian Liu

Media bias can significantly influence public perception, often subconsciously shaping opinions. To understand and measure this bias, diverse methodologies have emerged. While models from social sciences offer in-depth evaluations, they involve intensive manual analysis. In contrast, computerized models provide speed but often lack depth. This research explores the synergy between these disciplines, aiming to create a robust bias detection tool that combines the meticulousness of social science models with the automation of computer science. Using this interdisciplinary approach, a system was developed to evaluate articles and instantly present a 'bias score' on the user interface. This score offers readers an immediate indication of potential news slant. The research also integrated web crawling techniques into the system, allowing it to identify and recommend alternative articles on analogous subjects. This innovative feature enriches readers' choices, equipping them with multiple narratives for an enriched understanding. In conclusion, this work bridges the gap between depth and speed in media bias detection, offering a novel tool that promotes informed readership. The contribution of this study lies in its interdisciplinary approach and the development of a system that fosters holistic media consumption.

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241231

DeBERTa with hats makes Automated Essay Scoring system better

Shixiao Wang

Automated Essay Scoring (AES) is a rapidly growing field that applies natural language processing (NLP) and machine learning techniques to the analysis and evaluation of academic essays. By automating the process of evaluating essay quality, AES not only greatly reduces the workload of human graders but also ensures consistency and objectivity in the evaluation process. AES systems can evaluate essays based on multiple criteria, including organization, coherence, and content. With the advent of deep learning, AES has shown significant improvements in accuracy and reliability. AES systems have numerous applications in education, particularly in large-scale assessment and feedback loops. In this article, we delve into the use of DeBERTa, an improved Bidirectional Encoder Representations from Transformers (BERT) architecture with a disentangled attention mechanism, for student question-based summarization. This is one of the downstream tasks within AES and is of great significance for student learning assessment. Combining DeBERTa-v3 with diverse "hats", such as the Light Gradient Boosting Machine (LGBM) and Extreme Gradient Boosting (XGBoost) algorithms, proves highly effective in this task, indicating significant potential for real-world AES systems.

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241232

An analysis of BERT-based model for Berkshire stock performance prediction using Warren Buffet's letters

Geyang Yu

The objective of this study is to discover and validate effective Bidirectional Encoder Representations from Transformers (BERT)-based models for predicting the stock market performance of Berkshire Hathaway. The stock market is full of uncertainty and dynamism, and its prediction has always been a critical challenge in the financial domain. Accurate predictions of market trends are therefore important for investment decisions and risk management. The primary approach involves sentiment analysis of reviews of market performance. This work selects Warren Buffett's annual letters to investors and the year-by-year stock market performance of Berkshire Hathaway as the dataset. It leverages three BERT-based models, namely the BERT-Gated Recurrent Units (BERT-GRU) model, the BERT-Long Short-Term Memory (BERT-LSTM) model, and the BERT-Multi-Head Attention model, to analyse Buffett's annual letters and predict Berkshire Hathaway's stock price changes. The experiments indicate that all three models have a certain degree of predictive capability, with the BERT-Multi-Head Attention model demonstrating the best predictive performance.

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241234

Exploration, detection, and mitigation: Unveiling gender bias in NLP

Chunxiao Zhang

Natural Language Processing (NLP) systems have an impact on everyday life, yet they harbour gender bias, whether obvious or latent. The automation of decision-making in NLP models can even exacerbate unfair treatment. In recent years, researchers have started to notice this issue and have proposed approaches to detect and mitigate these biases, yet no consensus on the approaches exists. This paper discusses the interdisciplinary field of linguistics and computer science by presenting the most common gender bias categories and breaking them down with ethical and artificial intelligence approaches. Specific methods for detecting and minimizing bias are presented for biases in raw data, annotators, models, and the linguistic gender system. The paper gives an overview of the hotspots and future perspectives of this research topic. Limitations of some detection methods are pinpointed, providing novel insights for future research.

Research Article Open Access

Published 27 March 2024 DOI: 10.54254/2755-2721/52/20241252

Unleashing the power of Convolutional Neural Networks in license plate recognition and beyond

Maolin Wang

This article explores the application of Convolutional Neural Networks (CNNs) in the field of license plate recognition. It begins by introducing the architecture of a CNN, which consists of three key layer types: Convolutional Layers, Pooling Layers, and Fully Connected Layers. The article then reviews three relevant papers that demonstrate how CNNs are applied in license plate recognition. The first paper utilizes TensorFlow to construct a CNN model and integrates it with an STM32MP157 embedded chip for license plate recognition. The second paper presents a real-time car license plate detection and recognition method called Multi-Task Light CNN, emphasizing robustness. The third paper employs the ResNet+FPN feature extraction network of the Mask R-CNN model and annotates a license plate dataset. The article highlights the promising future of CNNs in various fields beyond license plate recognition, emphasizing their potential for further development and industrial applications. CNNs have proven to be versatile and powerful tools in computer vision, offering solutions to a wide range of problems. Their adaptability and effectiveness make them a key player in the ongoing advancement of artificial intelligence and automation technologies.
