The review of the referenced manuscript, CL2020-1087, entitled "Improved Network Intrusion Detection Using Hybrid Sequential Forward Feature Selection and Machine Learning", is now complete. I regret to inform you that based on the enclosed reviews and my own reading of your manuscript, I am unable to recommend its publication in IEEE Communications Letters.
Your paper may not be resubmitted for review. The reasons for this are as follows.
I have been able to recruit three expert researchers active in the topic of the manuscript. Although one of them has recommended a minor revision, the other two Reviewers have found the paper unsuitable for publication and highlighted, through their expressed concerns, that several major issues are present in the work, requiring a substantial effort.
Wrap-up of Reviewers' comments:
• Reviewers #1 and #3 have highlighted a number of instances where the paper's readability could be improved and where important clarifications were missing or needed.
• All the Reviewers have complained about an unsatisfactory review of the literature and recommended a clearer contextualization of the present work.
• Reviewers #1 and #2 have highlighted the limited novelty and timeliness of the proposed approach and results (on an outdated IDS dataset).
• Reviewers #1 and #2 have recommended a comparison with other baselines corresponding to alternative feature selection algorithms.
• Reviewers #2 and #3 have requested some clarifications regarding the considered evaluation setup.
I am really sorry that I cannot offer you any better news on this occasion.
The reviewers' comments are found at the end of this email.
Thank you for submitting your work to the IEEE Communications Letters.
Prof. Domenico Ciuonzo
IEEE Communications Letters
Comments to the Author
The authors propose the application of sequential forward feature selection (SFS) to improve the performance of classical machine learning algorithms (evaluated in terms of accuracy and F1-score) on a misuse detection task using the NSL-KDD dataset.
Paper is well written but needs further improvement. I recommend the paper for rejection.
1) In Sec. I, 2nd paragraph, I recommend to clarify the explanation of the two major challenges.
2) In Sec. I, 3rd paragraph, the sentence "The machine learning method is intended to learn better feature representations from a vast number of unlabeled data, and then add these learned features to the classification." is misleading. The authors overlook a distinction: the difference between supervised and unsupervised machine learning approaches. They seem to refer to unsupervised machine learning approaches, but the techniques used in the work are supervised.
3) I stress the lack of a Related Works section. The comparison of the proposal with the existing literature on the topic is necessary to enhance the contribution provided by the authors.
4) In Sec. IV B, 5th paragraph, the sentence "Results indicate that the time needed for the system to detect anomaly doing feature selection is faster than the process with feature selection." is misleading.
5) The results are far from novel.
6) As SFS is the contribution, I recommend a comparison with other feature selection algorithms to verify the effectiveness of the proposal.
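As a rough illustration of the kind of comparison requested in point 6, the following minimal sketch contrasts greedy sequential forward selection with a simple univariate filter baseline. Everything in it is an assumption made for illustration: the toy data, the nearest-centroid scorer, and the function names are hypothetical and are not taken from the manuscript, and the score is resubstitution accuracy on the toy data, whereas a real comparison would use a held-out test set of NSL-KDD.

```python
from typing import Callable, List

Score = Callable[[List[int]], float]


def sequential_forward_selection(n_features: int, k: int, score: Score) -> List[int]:
    """Greedy SFS: start empty, repeatedly add the feature that maximizes score(subset)."""
    selected: List[int] = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


def univariate_top_k(n_features: int, k: int, score: Score) -> List[int]:
    """Filter baseline: rank each feature by its individual score and keep the top k."""
    ranked = sorted(range(n_features), key=lambda f: score([f]), reverse=True)
    return ranked[:k]


# Hypothetical toy data (not NSL-KDD): feature 1 duplicates feature 0,
# while feature 2 is weaker alone but complements feature 0.
X = [
    [0.0, 0.0, 0.6],
    [0.0, 0.0, 0.6],
    [0.0, 0.0, 0.4],
    [0.6, 0.6, 0.0],
    [1.0, 1.0, 0.4],
    [1.0, 1.0, 0.6],
    [1.0, 1.0, 0.6],
    [0.4, 0.4, 1.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def centroid_accuracy(features: List[int]) -> float:
    """Resubstitution accuracy of a nearest-centroid classifier on the chosen features."""
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, label in zip(X, y):
        pred = min(
            classes,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(features, centroids[c])),
        )
        correct += pred == label
    return correct / len(y)


sfs_subset = sequential_forward_selection(3, 2, centroid_accuracy)
filter_subset = univariate_top_k(3, 2, centroid_accuracy)
print("SFS:", sfs_subset, centroid_accuracy(sfs_subset))
print("Univariate filter:", filter_subset, centroid_accuracy(filter_subset))
```

On this toy data the filter keeps the two redundant features, while SFS picks the complementary one at its second step. This is the sort of difference a baseline comparison would be expected to expose or rule out on the real dataset.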
Comments to the Author
This paper is interesting: it achieves better performance on baseline classifiers with the help of the Sequential Forward feature selection algorithm.
Is the sub-category list of network intrusion an exhaustive list?
Is it possible to mention how this compares with other feature selection algorithms applied to NIDS?
Is there any reason why the training set was chosen at 10%? If the training set percentage is changed, does the performance of the Feature selection algorithm change?
The definitions in Table 1, about the categories of attack, are a little confusing in places. For example, "Through transmitting data packet to that device, the attacker receives remotely unwanted access to a computer machine over a network". The word 'unwanted' is ambiguous.
Typo at Line 45, Page 2.
Comments to the Author
The NSL-KDD dataset has been described in thousands of works and doesn't need a discussion as lengthy as Section II. In addition, Tables I, II, and III are well known and could easily be referenced from other papers rather than described here again, especially when the authors DO NOT clarify whether they are performing a 2-class or 5-class classification. It seems that this is a 2-class classification ONLY, which is expected to give better results anyway. Similarly, Figures 1 and 3 and Algorithm 1 are totally unnecessary. Instead, the authors should focus on discussing related work and clarifying why 5 classes have been discussed when their proposed work simply classifies the data into attack and normal categories.
This paper lacks a comparison of its results with popular works. E.g., https://dl.acm.org/doi/pdf/10.4108/eai.3-12-2015.2262516 is one of the first works in the area and is not cited here. As I mentioned earlier, there are hundreds of other works, and a comparison with a few other recent works would be better. Another two recent works worthy of comparison: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8264962 and https://arxiv.org/ftp/arxiv/papers/1910/1910.01114.pdf.
The authors are also ambiguous about whether the reported accuracy is for the testing or the training dataset. More detailed results need to be provided for a better understanding and as evidence that this technique performs better than contemporary work.
Overall, the paper suffers from a lack of novelty (or at least of clarity on novelty), insufficient discussion of related work, a missing comparison with existing work, and sparse results; all of these need to be drastically improved.