Articles in this Volume

Research Article Open Access
Performance Advantage Comparison of Five Mainstream Optimizers on Datasets with Different Characteristics
This paper conducts two comparative experiments. One uses the Software Developer Salary Prediction Dataset, whose target variable is continuous and has a wide value range. The other uses the Red Wine Quality Dataset, whose target variable is quasi-discrete with a narrow value range and whose features are sparse. Five mainstream optimizers are compared in both experiments: Stochastic Gradient Descent (SGD), Momentum, Adaptive Gradient Algorithm (AdaGrad), Root Mean Square Propagation (RMSProp), and Adaptive Moment Estimation (Adam). To ensure a fair comparison, training parameters are unified across experiments and the same weight initialization method is used for all optimizers. Min-max normalization is applied to the salary data and Z-score normalization to the wine quality data, and a fixed random seed (42) is used throughout. No learning rate scheduling or early stopping is applied to any optimizer, so that convergence properties can be fully observed. Experimental results are recorded with high precision (two decimal places for salary prediction and five decimal places for wine quality prediction) to support clear visualization. The results show that adaptive optimizers outperform basic optimizers, but the best optimizer still depends on the specific task. Finally, practical guidance on optimizer selection is provided for each dataset type.
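For readers unfamiliar with the five optimizers named above, the following is a minimal sketch of their textbook update rules on a scalar parameter. It is not the authors' code: the test function f(w) = w^2, the learning rate, and the decay constants are illustrative assumptions, since the abstract does not state the hyperparameters used.

```python
import math

def make_optimizer(name, lr=0.1, beta=0.9, beta2=0.999, eps=1e-8):
    """Return step(w, g) implementing a textbook update rule for a scalar weight.

    All hyperparameter values here are illustrative assumptions.
    """
    state = {"v": 0.0, "s": 0.0, "t": 0}

    def step(w, g):
        state["t"] += 1
        if name == "sgd":
            return w - lr * g
        if name == "momentum":
            # Velocity accumulates past gradients.
            state["v"] = beta * state["v"] + g
            return w - lr * state["v"]
        if name == "adagrad":
            # Sum of squared gradients shrinks the step over time.
            state["s"] += g * g
            return w - lr * g / (math.sqrt(state["s"]) + eps)
        if name == "rmsprop":
            # Exponential moving average of squared gradients.
            state["s"] = beta * state["s"] + (1 - beta) * g * g
            return w - lr * g / (math.sqrt(state["s"]) + eps)
        if name == "adam":
            # First and second moment estimates with bias correction.
            state["v"] = beta * state["v"] + (1 - beta) * g
            state["s"] = beta2 * state["s"] + (1 - beta2) * g * g
            v_hat = state["v"] / (1 - beta ** state["t"])
            s_hat = state["s"] / (1 - beta2 ** state["t"])
            return w - lr * v_hat / (math.sqrt(s_hat) + eps)
        raise ValueError(f"unknown optimizer: {name}")

    return step

def minimize(name, steps=200, w0=5.0):
    """Run the optimizer on f(w) = w^2, whose gradient is 2w."""
    step = make_optimizer(name)
    w = w0
    for _ in range(steps):
        w = step(w, 2.0 * w)
    return w

if __name__ == "__main__":
    for opt in ("sgd", "momentum", "adagrad", "rmsprop", "adam"):
        print(opt, minimize(opt))
```

With a shared learning rate and no scheduling, as in the setup described above, the optimizers approach the minimum at visibly different speeds, which is the kind of difference the comparison is designed to expose.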
Research Article Open Access
Comparative Analysis of SGD Optimization under Single-GPU and Distributed Training Across Learning Rate Configurations
The learning rate is a key hyperparameter for SGD optimization: it directly affects how fast the model converges and how well it eventually performs. As deep learning models grow larger, training on a single GPU becomes too slow, and distributed training has become the norm. However, how the learning rate interacts with distributed architectures is not yet fully understood. This study compares SGD under single-GPU and DDP-based distributed training across five learning rates (0.0001, 0.001, 0.01, 0.1, 1.0). A synthetic binary classification dataset is used to avoid biases from real data. The global batch size is fixed at 32 in both settings, and training loss is recorded over 10 epochs. The results show that learning rates between 0.01 and 0.1 work best: they converge well and stay stable. A very small rate (0.0001) barely reduces the loss, while a very large rate (1.0) makes the loss curves noisy. Single-GPU training is slightly more stable than DDP at moderate learning rates, while DDP achieves a marginally lower final loss at the extremes. Interestingly, even a learning rate of 1.0 did not cause divergence, which suggests that the loss landscape of the synthetic problem is relatively smooth. These findings offer practical guidance for practitioners moving from single-GPU to distributed training: the usable learning rate window becomes narrower, so tuning needs to be more careful.
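The learning-rate sweep described above can be sketched in a single process as follows. This is only an illustration under stated assumptions: the one-feature logistic model, the data generator, and its seed are hypothetical stand-ins for the paper's synthetic dataset, and the distributed (DDP) half of the comparison is not reproduced.

```python
import math
import random

def sigmoid(z):
    # Clamp the logit so exp() cannot overflow at large learning rates.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def make_data(n=256, seed=0):
    """Hypothetical synthetic binary task: label is 1 when x plus noise is positive."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        y = 1.0 if x + rng.gauss(0.0, 0.3) > 0 else 0.0
        data.append((x, y))
    return data

def mean_loss(w, b, data):
    """Average binary cross-entropy over the dataset."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(data)

def train(lr, data, epochs=10, batch_size=32):
    """Plain mini-batch SGD with a fixed learning rate; returns the final loss."""
    w = b = 0.0
    for _ in range(epochs):
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            gw = sum((sigmoid(w * x + b) - y) * x for x, y in batch) / len(batch)
            gb = sum(sigmoid(w * x + b) - y for x, y in batch) / len(batch)
            w -= lr * gw
            b -= lr * gb
    return mean_loss(w, b, data)

if __name__ == "__main__":
    data = make_data()
    for lr in (0.0001, 0.001, 0.01, 0.1, 1.0):
        print(f"lr={lr}: final loss {train(lr, data):.4f}")
```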
Research Article Open Access
A Comparative Study of RNN, LSTM and GRU for Short-Term Traffic Speed Prediction
Accurate traffic speed prediction remains challenging because of the complex and dynamic temporal patterns in real-world traffic systems. This study focuses on using deep learning sequence models to capture short-term traffic dynamics. To achieve a fair and controlled comparison, three representative models, Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), are implemented under a unified framework. A sliding window strategy organizes the input data into multivariate time series, in which historical observations over 12 time steps are used to predict future traffic speeds. All models share the same architecture, composed of stacked recurrent layers and a fully connected layer, which keeps model capacity and training conditions consistent. Training uses the Adam algorithm with mean squared error as the loss function. Based on the comparative results, the best-performing model is further analyzed through a series of controlled experiments. Key hyperparameters, including hidden layer size, network depth, and learning rate, are varied systematically to examine their impact on prediction performance. This design allows the effects of model capacity and optimization dynamics to be evaluated in isolation. The experimental results show that, in the short-term prediction setting, the simpler recurrent structure can achieve competitive or even superior performance compared with the more complex gated models.
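The sliding window construction mentioned above can be illustrated as follows. This is a minimal sketch, not the authors' pipeline: the one-step prediction horizon and the choice of column 0 as the speed target are assumptions.

```python
def make_windows(rows, window=12, horizon=1, target_col=0):
    """Turn a multivariate series into (input window, target) pairs.

    rows:       list of per-time-step feature lists
    window:     number of historical steps per sample (12 in the abstract)
    horizon:    steps ahead to predict (assumed to be 1 here)
    target_col: index of the speed feature (assumed to be column 0)
    """
    inputs, targets = [], []
    for start in range(len(rows) - window - horizon + 1):
        inputs.append(rows[start:start + window])
        targets.append(rows[start + window + horizon - 1][target_col])
    return inputs, targets

if __name__ == "__main__":
    # Toy series with two features per step: a speed value and an hour-of-day value.
    series = [[float(t), float(t % 24)] for t in range(20)]
    X, y = make_windows(series)
    print(len(X), len(X[0]), y[0])
```

Each input sample has shape (12, number of features), which is the layout recurrent layers typically expect after batching.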
Research Article Open Access
Optical Biosensors Based on Bound States in the Continuum
Bound states in the continuum (BICs) exhibit theoretically infinite quality factors (Q factors) and resonances with ultra-narrow linewidths, offering transformative potential for the field of optical biosensing. This review surveys BIC-enabled biosensors and classifies them by the type of material used: metallic, all-dielectric, and metal-dielectric hybrid. The paper first introduces BICs and how they can be categorized. Symmetry-protected and accidental BICs (such as Fabry-Pérot and Friedrich-Wintgen BICs) are discussed, along with how structural asymmetry can be used to explain sensing performance within a coupled-mode theory framework. The classically defined core performance metrics, i.e., quality factor, sensitivity, full width at half maximum, and figure of merit, are elaborated. Metallic BICs take advantage of plasmonic near-field enhancement to reach low detection limits, but are limited by Ohmic losses. All-dielectric BICs achieve ultrahigh Q factors and topological robustness from Mie resonances, and hybrid systems combine the sensitivity of plasmonics with the high-Q confinement of dielectrics. State-of-the-art applications range from tumor phenotyping to exosome detection. The current capabilities of BIC-enabled optical biosensors and their potential are also assessed. The principal challenges ahead are highlighted, with emphasis on machine-learning-assisted BIC design, microfluidic on-chip integration, and biosensing with gradients. These proposed technical advances would provide the essential features needed to develop the intelligent biosensing systems of the future.
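The performance metrics listed above are usually related through a few simple ratios, sketched here with their common definitions for refractive-index sensing. The numerical values are hypothetical, and the inverse-square dependence of Q on the asymmetry parameter is the widely used small-asymmetry scaling for symmetry-protected quasi-BICs, assumed here rather than taken from the review.

```python
def quality_factor(resonance_nm, fwhm_nm):
    """Q = resonance wavelength / full width at half maximum."""
    return resonance_nm / fwhm_nm

def sensitivity(shift_nm, delta_n):
    """Bulk sensitivity S = resonance shift per refractive index unit (nm/RIU)."""
    return shift_nm / delta_n

def figure_of_merit(sensitivity_nm_per_riu, fwhm_nm):
    """FOM = S / FWHM (1/RIU)."""
    return sensitivity_nm_per_riu / fwhm_nm

def quasi_bic_q(q0, asymmetry):
    """Assumed scaling Q = q0 / asymmetry^2; q0 is a structure-dependent constant."""
    return q0 / asymmetry ** 2

if __name__ == "__main__":
    # Hypothetical resonance: 1550 nm center, 0.5 nm linewidth, 2 nm shift for dn = 0.01.
    s = sensitivity(2.0, 0.01)
    print(quality_factor(1550.0, 0.5), s, figure_of_merit(s, 0.5))
```

The ratios make the trade-off explicit: a narrower linewidth raises both Q and FOM for a given sensitivity, which is why low-loss all-dielectric structures are attractive.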
Research Article Open Access
Adaptive Gradient-Aligned Aggregation for Federated Learning under Non-IID Data
Federated learning (FL) is a distributed training paradigm with privacy-preserving capabilities and has attracted considerable attention in recent years. However, statistical heterogeneity across clients still limits its performance. In a non-IID environment, inconsistent local optimization directions can lead to client drift, unstable convergence, and reduced accuracy of the global model. This paper therefore proposes a multi-factor adaptive aggregation strategy for non-IID scenarios, which evaluates each client on three aspects (data volume, consistency of gradient direction, and local training quality) and allocates adaptive aggregation weights accordingly. A gradient direction filtering mechanism is also introduced to reduce the impact of conflicting local updates before global aggregation. Experiments on the MNIST and CIFAR-10 datasets construct two scenarios, with FedAvg, FedProx, and SCAFFOLD as baselines. The results show that the proposed method performs stably under IID conditions and better than the baselines under non-IID conditions. Ablation experiments further validate the effectiveness of gradient consistency modeling, training quality weighting, and the update filtering mechanism. The main contribution of this work is to improve convergence stability and enhance the robustness of federated optimization in heterogeneous environments.
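One way to picture a multi-factor aggregation rule of this kind is sketched below. It is a hypothetical illustration, not the paper's formula: the reference direction (the sample-weighted mean update), the cosine threshold, the quality score 1 / (1 + loss), and the multiplicative combination of the three factors are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def aggregate(updates, num_samples, local_losses, cos_threshold=0.0):
    """Combine client updates with weights from data volume, alignment, and quality.

    Returns (aggregated update, per-client weights).
    """
    dim = len(updates[0])
    total = sum(num_samples)
    # Reference direction: sample-weighted mean of the client updates.
    ref = [sum(n * u[k] for n, u in zip(num_samples, updates)) / total
           for k in range(dim)]
    scores = []
    for u, n, loss in zip(updates, num_samples, local_losses):
        align = cosine(u, ref)
        if align <= cos_threshold:
            scores.append(0.0)  # filter out updates that conflict with the reference
            continue
        quality = 1.0 / (1.0 + loss)  # lower local loss gives higher quality
        scores.append((n / total) * align * quality)
    norm = sum(scores)
    if norm == 0.0:
        weights = [n / total for n in num_samples]  # fall back to volume-only weights
    else:
        weights = [s / norm for s in scores]
    agg = [sum(w * u[k] for w, u in zip(weights, updates)) for k in range(dim)]
    return agg, weights

if __name__ == "__main__":
    updates = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
    agg, weights = aggregate(updates, [100, 100, 50], [0.5, 0.5, 0.5])
    print(agg, weights)
```

In the example, the third client points against the reference direction and receives zero weight, while the remaining two share the aggregation.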
Research Article Open Access
Adaptive Weights Based Ensemble Forecast for Bike Sharing Request Using XGBoost and Prophet
In intelligent urban transportation and enterprise market analysis, accurate and lightweight bicycle demand forecasting is critical for frontline dispatchers. Complex real-world weather conditions make it hard for dispatchers to predict demand and respond quickly, which significantly prolongs real-time bicycle dispatch and leads to urban traffic congestion and revenue losses for enterprises. To address this, this study proposes a Dynamic Weighted Stacked Ensemble (DWSE) composed of the XGBoost method and the Prophet algorithm. Sliding windows record the most recent prediction errors of each base predictor, and the predictors' weights are computed automatically from the size of those errors: the smaller a predictor's error, the higher the weight it receives. Experiments are conducted on day-level trip data from the Washington Bikeshare service. DWSE achieves better prediction accuracy than conventional methods such as LSTMs and random forests. The model is lightweight, with substantially lower parameter complexity than most neural network ensemble schemes, offering an effective and reliable prediction tool for real-time bike-sharing dispatch tasks.
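The error-driven weighting described above can be sketched as follows. This is a minimal illustration under assumptions, not the DWSE implementation: it uses inverse mean absolute error over the window as the weighting rule, and the window length and class name are hypothetical.

```python
from collections import deque

class DynamicWeightedEnsemble:
    """Blend base predictors with weights inversely proportional to recent error."""

    def __init__(self, model_names, window=7, eps=1e-8):
        self.eps = eps
        # One fixed-length error history per base predictor.
        self.errors = {name: deque(maxlen=window) for name in model_names}

    def weights(self):
        # Until every predictor has an error history, weight them equally.
        if any(len(errs) == 0 for errs in self.errors.values()):
            n = len(self.errors)
            return {name: 1.0 / n for name in self.errors}
        inverse = {name: 1.0 / (sum(errs) / len(errs) + self.eps)
                   for name, errs in self.errors.items()}
        total = sum(inverse.values())
        return {name: value / total for name, value in inverse.items()}

    def predict(self, predictions):
        """predictions: dict mapping predictor name to its forecast."""
        w = self.weights()
        return sum(w[name] * predictions[name] for name in w)

    def update(self, predictions, actual):
        """Record each predictor's absolute error once the true value is known."""
        for name, errs in self.errors.items():
            errs.append(abs(predictions[name] - actual))

if __name__ == "__main__":
    ens = DynamicWeightedEnsemble(["xgboost", "prophet"], window=3)
    ens.update({"xgboost": 105.0, "prophet": 120.0}, actual=100.0)
    print(ens.weights())
    print(ens.predict({"xgboost": 110.0, "prophet": 100.0}))
```

After one observation with errors of 5 and 20, the weights move to roughly 0.8 and 0.2, so the more accurate predictor dominates the next forecast.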
Research Article Open Access
Optimization of the Feeding Process for a Cast Steel Bucket Tooth Based on Anycasting Simulation
Cast steel bucket teeth are susceptible to shrinkage cavity and shrinkage porosity because of wall-thickness variation, local hot spots, and limited feeding paths. A feeding-process optimization route combining modulus analysis and AnyCasting simulation is proposed. The molding material, pouring orientation, parting surface, and semi-closed gating system were first designed. Hot spots and the final solidification region were then identified through modulus analysis and verified by filling, solidification, and defect-prediction simulations. An obround blind riser and external chills were subsequently introduced to improve feeding. Region 7 was identified as the main shrinkage-sensitive zone, with the largest modulus of 1.22 cm. After optimization, filling remained stable, the thermal gradient became more favorable, and shrinkage defects were transferred from the casting body to the riser, leaving no obvious defects in the casting body. The comparison confirms that coordinated riser-chill design is more effective than relying on a single feeding measure. The proposed route improves the reliability of hot-spot identification and feeding-system design, clarifies the effect of coordinated riser-chill design on shrinkage control, and provides an effective process-design method for similar cast steel wear-resistant components.
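The modulus analysis mentioned above rests on the geometric modulus M = V / A, the ratio of a section's volume to its cooling surface area. The sketch below shows that calculation for a simple block and a riser sizing check. The factor of 1.2 is a common foundry rule of thumb assumed here for illustration, not a value stated in the abstract, and the function names are hypothetical.

```python
def modulus(volume_cm3, cooling_area_cm2):
    """Geometric casting modulus M = V / A, in cm."""
    return volume_cm3 / cooling_area_cm2

def block_modulus(a_cm, b_cm, c_cm):
    """Modulus of a rectangular block cooled on all six faces."""
    volume = a_cm * b_cm * c_cm
    area = 2.0 * (a_cm * b_cm + b_cm * c_cm + c_cm * a_cm)
    return modulus(volume, area)

def min_riser_modulus(casting_modulus_cm, factor=1.2):
    """Rule-of-thumb lower bound so the riser solidifies after the hot spot."""
    return factor * casting_modulus_cm

if __name__ == "__main__":
    print(block_modulus(6.0, 6.0, 6.0))   # a 6 cm cube has modulus 1.0 cm
    print(min_riser_modulus(1.22))        # bound for the 1.22 cm region in the abstract
```

Because solidification time grows with modulus, the region with the largest modulus freezes last, which is why it is treated as the main shrinkage-sensitive zone.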
Research Article Open Access
Pixel-to-Angle Mapping and Angle-Domain Incremental Control for Vision-Guided Gimbal Target Tracking
This paper presents a vision-guided two-axis gimbal tracking method based on pixel-to-angle mapping and angle-domain incremental control. Instead of directly feeding image-plane pixel errors into a conventional pixel-domain controller, the proposed pipeline first converts the target deviation from the image center into physically meaningful yaw and pitch angle errors using camera calibration parameters. These angle errors are then used to generate angular-velocity commands and motor RPM references through an angle-domain control chain. To improve practical deployability under sensing noise, abrupt target displacement, and actuator limitations, dead-zone logic, command saturation, and low-pass filtering are incorporated into the control loop. The paper further organizes the method into a complete conference-style presentation including system overview, imaging geometry, control architecture, and evaluation protocol. The resulting formulation provides a physically consistent interface between visual measurements and gimbal actuation, and offers a more interpretable basis for future extensions with encoders, IMU feedback, and more advanced front-end detectors and trackers.
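The chain from pixel error to motor command described above can be sketched as follows, assuming a pinhole camera model. The intrinsics (fx, fy, cx, cy), the proportional gain, the dead-zone width, the rate limit, the filter coefficient, and the gear ratio are all hypothetical values chosen for illustration, and the sign conventions are assumptions.

```python
import math

def pixel_to_angle(u, v, fx, fy, cx, cy):
    """Convert a pixel position to yaw and pitch errors (rad) under a pinhole model."""
    yaw = math.atan2(u - cx, fx)
    pitch = math.atan2(v - cy, fy)  # image v grows downward; sign convention assumed
    return yaw, pitch

def control_step(angle_err, prev_cmd, kp=2.0, dead_zone=math.radians(0.2),
                 max_rate=math.radians(90.0), alpha=0.5):
    """One axis update: dead zone, proportional rate, saturation, low-pass filter."""
    raw = 0.0 if abs(angle_err) < dead_zone else kp * angle_err
    raw = max(-max_rate, min(max_rate, raw))        # command saturation
    return alpha * raw + (1.0 - alpha) * prev_cmd   # first-order low-pass filter

def rad_s_to_rpm(omega_rad_s, gear_ratio=1.0):
    """Convert an axis rate in rad/s to a motor RPM reference."""
    return omega_rad_s * 60.0 / (2.0 * math.pi) * gear_ratio

if __name__ == "__main__":
    yaw, pitch = pixel_to_angle(700.0, 360.0, fx=800.0, fy=800.0, cx=640.0, cy=360.0)
    rate = control_step(yaw, prev_cmd=0.0)
    print(math.degrees(yaw), rate, rad_s_to_rpm(rate))
```

Working in angles rather than pixels means the gain and limits keep the same physical meaning if the camera resolution or lens changes, since only the intrinsics need updating.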
Research Article Open Access
Regression Prediction of Industrial Steam Boiler Thermal Efficiency Based on Machine Learning Algorithms
Industrial steam boilers are core equipment in energy and power systems, yet their thermal efficiency is difficult to predict accurately with traditional mechanism modeling and counter-equilibrium calculation under deep peak shaving and complex working conditions. Single machine learning and deep learning models suffer from insufficient capture of temporal dependencies, reliance on manual parameter tuning, and weak generalization ability. This paper proposes a DE-Transformer-BiLSTM regression prediction algorithm, which combines differential evolution optimization, Transformer global feature extraction, and BiLSTM bidirectional temporal modeling to optimize key parameters automatically while accounting for both global correlations and local temporal features. The experiment uses 1283 valid monitoring records from an industrial site, with 10 input features and boiler thermal efficiency as the output variable, and carries out comparative experiments after data preprocessing and correlation analysis. The results show that the model achieves an MSE of 0.142, an RMSE of 0.377, an MAE of 0.286, a MAPE of 0.308, and an R² of 0.93. All metrics are better than those of many traditional models. The method can effectively improve the prediction accuracy and generalization ability of boiler thermal efficiency models under complex working conditions, and provides a new approach for energy efficiency prediction of active systems.
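The five reported metrics follow standard definitions, sketched below for reference. The sample values are made up for illustration, and MAPE is expressed in percent here as an assumption, since the abstract does not state its unit. One quick consistency check: RMSE is the square root of MSE, and the square root of 0.142 is about 0.377, matching the reported figures.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, MAPE (percent), and R^2 for paired sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE assumes no true value is zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "mape": mape,
        "r2": 1.0 - ss_res / ss_tot,
    }

if __name__ == "__main__":
    # Made-up efficiency values (percent) for illustration only.
    print(regression_metrics([90.0, 92.0, 94.0], [91.0, 92.0, 93.0]))
```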