“Improving malware detection accuracy through hyperparameter tuning with parallel processing” (Undergraduate student’s research for the 2023 academic year)

In this study, we improved the efficiency of model development and enhanced malware detection accuracy by performing parallel hyperparameter tuning for machine learning models.

Traditional antivirus detection relies primarily on pattern matching, in which files are compared against previously known malicious code. However, this approach struggles to keep up with the large number of new malware variants that appear daily. To address this issue, we built a malware classification model from a public dataset containing information on more than 50,000 files, including malicious samples. We then used Optuna, a library that automatically searches for optimal hyperparameters, and ran the search in parallel so that the optimization finished efficiently.
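As a rough illustration of what evaluating hyperparameter candidates in parallel looks like, the following sketch scores many candidate settings concurrently and keeps the best one. It is not the code used in this study. The synthetic data, the two-parameter threshold classifier, and every function name here are hypothetical, and plain random search stands in for Optuna's sampler so that the example needs only the Python standard library. In Optuna itself, parallelism is typically requested through the `n_jobs` argument of `study.optimize` or by running several workers against a shared study storage; the source does not say which was used.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def make_toy_data(n=400, seed=0):
    """Synthetic stand-in for file features: label 1 ("malicious") if 2*x0 + x1 > 1.5."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x0, x1 = rng.random(), rng.random()
        data.append(((x0, x1), int(2 * x0 + x1 > 1.5)))
    return data


def accuracy(params, data):
    """Fraction of samples classified correctly by the rule w*x0 + x1 > t."""
    w, t = params
    correct = sum(int(w * x0 + x1 > t) == y for (x0, x1), y in data)
    return correct / len(data)


def parallel_random_search(data, n_trials=200, n_workers=4, seed=1):
    """Sample all candidates up front, then score them concurrently."""
    rng = random.Random(seed)
    candidates = [(rng.uniform(0.0, 4.0), rng.uniform(0.0, 3.0))
                  for _ in range(n_trials)]
    # Threads keep the sketch simple; CPU-bound model training would
    # normally use separate processes or machines instead.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        scores = list(pool.map(lambda p: accuracy(p, data), candidates))
    best = max(range(n_trials), key=scores.__getitem__)
    return candidates[best], scores[best]


train = make_toy_data(seed=0)
holdout = make_toy_data(seed=42)  # separate data, never seen by the search
best_params, train_acc = parallel_random_search(train)
holdout_acc = accuracy(best_params, holdout)
print(f"best params: {best_params}")
print(f"train accuracy: {train_acc:.3f}, holdout accuracy: {holdout_acc:.3f}")
```

Because each trial is independent of the others, the work divides cleanly across workers. Scoring the selected setting on data the search never saw mirrors, in miniature, the validation on an additional dataset.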

As a result, our model achieved an accuracy of 99.35%, surpassing the 98.00% reported in previous studies. Validation on an additional dataset also yielded accuracies in the 97%–99% range, suggesting that the approach is robust and generalizes to malware not seen during training.

In summary, our findings indicate that efficient hyperparameter optimization using parallel processing contributes significantly to improving the accuracy of malware detection.

Flowchart of the implemented system