Cloudy with a Chance of Breach: Forecasting Cyber Security Incidents

Link to the paper: https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-liu.pdf

Summary:

This study continues the trend of papers branching out from conventional cyber security research toward the prediction of cyber events. Specifically, the study collected 258 externally measurable features that together form an organization's security posture profile, and used them to train a model that tries to predict future cyberattacks. One of the major flaws of the study, though, is its data: in the end the authors only had enough incidents to train and test a per-category model for one event type, web application incidents. Furthermore, the study has a higher false positive rate than other prediction methods such as RiskTeller, which has been proposed in other papers.
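
To make the setup concrete, below is a minimal sketch of what this kind of posture-based forecaster could look like: a binary classifier over externally measurable features, one row per organization. Everything here is synthetic placeholder data, not the paper's dataset, and the Random Forest with these hyperparameters is my own assumption about a plausible setup rather than the paper's exact pipeline.

```python
# Minimal sketch of a posture-based incident forecaster (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_orgs, n_features = 1000, 258              # one row per organization
X = rng.random((n_orgs, n_features))        # stand-in for the 258 posture features
y = rng.integers(0, 2, n_orgs)              # 1 = organization later had an incident

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

scores = clf.predict_proba(X_test)[:, 1]    # predicted incident risk per organization
print("Held-out AUC:", roc_auc_score(y_test, scores))
```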

 

What I liked:

  1. The study encompasses 258 externally measurable features, meaning there is a large amount of observational data for the model to use

  2. I really like the patient analogy for prediction vs. detection, but I think there is a distinction that needs to be made: prediction is far more valuable than detection. If a patient is sick and a doctor detects the sickness, he or she can prescribe medicine to make the problem go away, whereas in cyber security, by the time you detect a problem the damage may already be done. Being able to predict where cyberattacks will occur and shore up defenses in a cost-effective manner is much more needed than detection

  3. The large number of incidents drawn from different event databases creates diversity for the model to learn from

  4. The study does a good job of weeding out incidents that had nothing to do with security posture, such as insider attacks

  5. The train/test split was done chronologically, meaning the evaluation more closely mirrors real-life forecasting (see the sketch below)
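
The sketch below illustrates what a chronological split means in practice: train on the past, test on the future, so no later information leaks into the model. The dates and features are made-up placeholders, not the paper's data.

```python
# Sketch of a chronological train/test split (synthetic data).
import numpy as np

rng = np.random.default_rng(1)
n = 500
incident_day = np.sort(rng.integers(0, 365, n))   # day-of-year of each labeled sample
X = rng.random((n, 10))                           # stand-in feature vectors
y = rng.integers(0, 2, n)                         # stand-in incident labels

cutoff_day = 270                                  # e.g. train on Jan-Sep, test on Oct-Dec
train_mask = incident_day < cutoff_day
X_train, y_train = X[train_mask], y[train_mask]
X_test, y_test = X[~train_mask], y[~train_mask]

print(f"{train_mask.sum()} training samples, {(~train_mask).sum()} test samples")
```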

 

What I didn’t like:

  1. The study reports a 90% true positive rate and a 10% false positive rate, which is less effective than a lot of other papers; specifically, the RiskTeller paper reports a 95% true positive rate (a quick sketch of how these rates are computed appears after this list)

  2. I don’t think the study clearly defines how it decides what counts as malicious activity in the security posture data, which raises the question of what exactly is being predicted

  3. I don’t like that the study uses a collection of datasets whose collection windows are offset by a couple of months; the data feels disjointed and might not paint an accurate picture

  4. One of the major issues with this study is the claim that its security snapshot does not change much month to month, as compared to the day-to-day snapshots used in other studies like RiskTeller. I think even this snapshot is flawed, because the datasets don't overlap in time, so the snapshot may not be coherent.

  5. The study says the only incident type for which they had enough data to train and test a model was web application incidents, which means they really didn't get much out of their data
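
As referenced in point 1 above, here is how a (true positive rate, false positive rate) operating point like 90%/10% is read off a set of predictions. The labels and predictions below are made up purely for illustration.

```python
# How TPR and FPR are computed from predictions (toy example).
import numpy as np

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0, 0, 1])   # 1 = real incident
y_pred = np.array([1, 1, 0, 0, 1, 0, 1, 0, 0, 1])   # 1 = model flagged it

tp = np.sum((y_pred == 1) & (y_true == 1))
fn = np.sum((y_pred == 0) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
tn = np.sum((y_pred == 0) & (y_true == 0))

tpr = tp / (tp + fn)   # fraction of actual incidents that were flagged
fpr = fp / (fp + tn)   # fraction of non-incidents incorrectly flagged
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```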

Points to talk about:

  1. This study has a higher false positive rate than the RiskTeller study; what methods from that study made it more effective at reducing false positives?

  2. Do cyberattack types vary from country to country, or are they globalized, i.e. the same across countries?

  3. Are hosting companies like GoDaddy following cyber security best practices? The study omits web hosts' names to avoid biasing the model, but that raises the question of why their names show up so many times in the incident details.

  4. Why are attacks from the WHID Database detected less often than other attacks in Figure 6?

  5. Does the size of the network increase or decrease the risk of an attack?

New Ideas:

  1. Look into creating a study that uses all three of their datasets (mismanagement symptoms, malicious activities, and incident reports) collected over the same time window, instead of staggered as in this study

  2. How would this study change if it included things like insider attacks, which were left out here? I'm sure there are posture measures that restrict an internal attacker's ability to really hurt the company.

  3. How would the model change if the study kept the hosting information?

  4. Does relying on multiple databases make this system more reliable in its predictions? Can we create novel test data and compare outcomes against other prediction methods, such as RiskTeller, that only use one source of data?

  5. Redo this study with a bigger dataset so that valid models can be trained for different event types