Ramblings on risk starting with John Pescatore and ending with comments from FAIR risk framework creator Jack Jones.
This is a direct transcription of a discussion about risk management on LinkedIn started by Infospectives owner Sarah Clarke. For those who can access the online discussion it’s here. If not, do read on. It is now 8 months old, but just today a similar discussion with similar responses kicked off on Twitter, thanks to a nicely provocative tweet from Andrew Barratt.
In both exchanges there are more questions than answers (and some of the best minds in the trade chipped in), but that is the nature of risk management in the IT and InfoSec trade at the moment. No consensus, no easy answer.
Where explicit permission to share hasn’t been gained, names have been left out.
Where are you now?
Immovable IS risks on risk profiles? Difficulty comparing risks and prioritizing remediation? Eventual exec apathy about infosec risk management until there is an incident? Sound familiar? John Pescatore’s SANS blog post calls this out and proposes a better way.
Beautiful! Cogent, concise, illuminating and persuasive. Skewers the logical flaws in applying the “standard approach” to IT — and to a lot of other types of risks where probability and magnitude are wild guesses.
Risk Management Expert
Well-written material. Too bad it does not use the correct definition of risk, e.g. it completely disregards opportunities and focuses on threats only.
Old school, needs some ISO31000 IV 🙂
Security Governance, Risk & Compliance Specialist
Fair point, but to my mind the biggest challenge is in splicing the operational level view of risks, issues and incidents (e.g. missing patch, small data disclosure, lost device, website vulnerability, legacy kit related risk, potential access loophole) into the uber high level one favoured for corporate consumption. Top down and bottom up risk assessments can miss each other going in opposite directions.
How are you scaling, aggregating and grouping infosec risks? Have you got a level playing field to compare your infosec risks to traditional commercial risks so remedial spend can be prioritised? I’m expecting there to be answers, but I’m not expecting them to have much in common with each other.
Risk Management Expert
Sarah, thanks for your reply.
All infosec risks translate into a change in the balance sheet, otherwise for-profit companies would have no real reason to invest money in addressing them.
The real issue is how to calculate the impact (there are many ways) and how to reduce the threats (never ever reduce the risk: no risk = no business) while enhancing the opportunities.
Executive Advisor – A Risk Leadership Organisation
A couple of thoughts…..
1. If the traditional approach of probability x impact fails (or misleads) because both elements are so hard to estimate, what is the alternative to help management rank risks / threats and decide where to focus the inevitably limited resources to best effect?
2. We also take velocity into account, i.e. how quickly the estimated impact will actually occur after the event takes place. For some incidents you have more time to react and for others much less time to respond, and that’s an important factor to estimate and to take into account.
Like the velocity point. On the question of monetizing risks…goes without saying. We’re still left with the practical question of how. As John says we can’t bleed the value out of the risk management effort by spending 100s of man hours drilling down to potential impact and converting to monetary bases. A real problem as infosec risks often result from multiple threats/vulnerabilities and have potential to impact many business areas. Business areas who often need lots of support to understand why the risk is a risk and who, like the rest of the industry, struggle to place a cost on brand damage and poor publicity.
Research Director with Leading IT & Security Solutions Review Company
That’s an unfortunate blog post, because it belies a failure to even remotely understand the basis of quantitative risk analysis. Quant risk is NOT defined as “probability x impact” – not even close. Nor is the “simpler” version he cites correct (it’s typically been threat x vuln x impact – no idea what this “action” nonsense denotes).
For a good basis on quant methods, FAIR is now freely available through The Open Group: http://www.opengroup.org/certifications/openfair. I also highly recommend Hubbard’s “How To Measure Anything” for a background on estimation and calibration.
We need to stop perpetuating this antiquated, uneducated nonsense.
FAIR isn’t quantitative risk assessment. It’s a solid extra layer of logic and analysis over and above the rest of the qualitative “antiquated, uneducated nonsense”. There’s more I want to read, but would be good to hear from folk who’ve rolled this out and used it in anger for a while, specifically in the IT/IS risk space.
Research Director with Leading IT & Security Solutions Review Company
FAIR absolutely is a quantitative risk analysis methodology. It plugs into your risk assessment process. CXOWARE also indicates that they’ve developed an approach for aggregating these risk estimates accordingly (within their SaaS tools). I recommend talking to them for more information about all of that.
Alternatively, one can use OpenFAIR as a reference basis for then building one’s own quantitative approach. The same challenges with aggregation will apply.
I was being, perhaps unhelpfully, facetious. Unless dealing with homogenous assets with well defined value, risk estimates are all going to be subjective before you drill too far down away from the figure they spit out – one of John’s bugbears. If it’s done consistently enough to allow comparison between operational and financial risks without killing the staff doing it, I’m all for it.
Information Security and GRC Auditor
There are a whole slew of issues in all this, the worst of which, to my mind, is trying to reduce a multi-dimensional issue to a single cv
The best we might take is a “Sum over all combinations divided by the number of them”, where the combinations are derived on the basis of the threat & its probability, how the threat might exploit any one of a number of vulnerabilities and to what extent, which assets each of those will affect, the value of assets, the degree to which each of those paths will affect the asset, and of course some way of factoring in any existing controls.
“Bleedin’ obvious ain’t it?”
Well, yes, but it’s also rather more than most people want to handle.
And of course much of it is information that is unavailable or guesswork.
So we come up with simplifications like that in the article. But those simplifications make assumptions that are not only not stated (aka hidden) but may not apply. Homogeneous assets is just one of them. That all exploits have the same effect is another.
Many of these simplifications come from an era when we didn’t have computers to do detailed modeling (aka crunching large matrices), or information sources that told us what vulnerabilities existed and were being uncovered. And of course that era was dominated by the accountant’s view of assets, namely book value and depreciation (or appreciation) of purchased material, which ignores such things as data, regulatory compliance, human skills and more, things that can be at risk and ‘impacted’ in a deleterious manner.
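One possible reading of the “sum over all combinations divided by the number of them” idea from this comment can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name `average_exposure`, the multiplicative way the factors are combined, the single control-effectiveness figure, and all of the numbers. Note that it bakes in exactly the kind of hidden assumption the commenter warns about, namely that every threat can exploit every vulnerability against every asset.

```python
from itertools import product


def average_exposure(threats, vulnerabilities, assets, control_effectiveness=0.0):
    """Average exposure over every threat x vulnerability x asset combination.

    threats:          name -> probability the threat materialises
    vulnerabilities:  name -> extent (0..1) to which the threat can exploit it
    assets:           name -> (value, degree 0..1 to which the asset is affected)
    control_effectiveness: 0..1, fraction of exposure removed by existing controls
    """
    residual = 1.0 - control_effectiveness
    exposures = []
    for p_threat, exploit_extent, (value, degree) in product(
        threats.values(), vulnerabilities.values(), assets.values()
    ):
        exposures.append(p_threat * exploit_extent * value * degree * residual)
    return sum(exposures) / len(exposures)


# ASSUMPTION: hypothetical inputs, not taken from the discussion.
threats = {"phishing": 0.30, "lost device": 0.10}
vulnerabilities = {"unpatched server": 0.5, "weak access control": 0.8}
assets = {"customer database": (500_000, 0.6), "intranet wiki": (20_000, 0.9)}

print(f"{average_exposure(threats, vulnerabilities, assets, 0.4):,.0f}")
```

Even with two entries per dimension there are eight combinations to estimate, which gives a feel for why this is “rather more than most people want to handle”.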
IT Governance and Controls Consultant
The bigger issue here is that by focusing on financial metrics, and therefore Financial Risk, IT may downplay or ignore other Business Risk considerations. For example, benchmarking your compliance status versus peer group can be a powerful influence tool. It has Standard of Care (legal risk element) written all over it and takes advantage of how others in your industry perceive risk. Or, even more important, the real risk from the business point of view may be the impact on product time to market (going concern risk).
Bottom line, better to get in touch with your business owner’s risk perspective (market, reputation, financial, legal, …) and do the homework needed to evaluate how IT exposures / threats stack up
Information Security and GRC Auditor
There’s an example of that back in the SOX stuff from the PCAOB.
SOX was there, to quote from the register, “to restore investor confidence in the American financial system” …. after Enron and all that.
Well, OK, the loosening of the controls later that led to another crash … we’ll pass over that. But in PCAOB Advisory #4, I think it was, there was a notice that DR/backup is not part of the audit/requirements. I don’t have it to hand so I’m not sure of the exact wording, but it caused an outrage. Other countries’ SOX-equivalents are smarter and do include DR/Backup.
The point here is this: Would you invest in a company that didn’t do backups, that didn’t have a DR/BC Plan?
The example isn’t exact, but in reality most companies now are dependent on IT; that’s where they keep books & financial records, communications and all that. Maybe even access controls such as who can take an elevator to which floor.
OBTW: Read “Risks Digest” for some ideas.
Having run the SOx system testing effort for one part of the business and a supplier security governance service I can’t resist jumping in here. True there’s great confidence to be had from a relatively clean bill of health compliance-wise, or at least evidencing robust treatment of discovered non-compliance for the first couple of turns of the handle. The challenge is that all compliance benchmarks are not created equal and in order to run a maintainable, repeatable, credible compliance service it HAS to incorporate risk.
Let’s leave the question of controls scope on one side. People get bored of me talking about the ISO Certificate that looked great and covered their whole company, but ONLY for physical security controls.
Risk assessments have to be done to understand the potential risk linked to one type of deficiency compared to another, e.g. if you’ve got comprehensive joiners, movers, leavers processes there’s less risk if you don’t periodically revalidate all access. You also have to risk assess the systems in question to decide scope (who can resource to assess everything?) and to splice the overall system risk to the type of deficiencies found. Then there’s the platforms underpinning them and the security and network devices underpinning that, creating a matrix to understand interactions and dependencies.
I guess I’m saying, compliance, if you make it a flat “tick box” exercise will satisfy some of the people some of the time, but to be credible and foster long term internal and customer confidence, plus get the business to sign off spend for pricey fixes or accept the risk of things that can’t be fixed, you’re always going to have to circle back to good consistent ways to assess IS and IT risk.
Audit & Risk Specialist
“Bleedin’ obvious ain’t it? Well, yes, but it’s also rather more than most people want to handle. And of course much of it is information that is unavailable or guesswork.”
This is the problem with Black Swans, isn’t it? It’s not what you know that keeps you up at night, it is what you don’t know.
In this light, does anybody have any experience with using Bayesian logic to: (1) expose initial biases, and (2) use new evidence to reweight risk probabilities?
For example, auditors typically start with the initial bias that the outcome will be favorable, i.e., a 5% probability that the audit will uncover material failures. Using Bayesian logic, it is possible to show that any negative evidence developed during the audit has to be overwhelmingly convincing to overcome this initial bias and, thus, will probably be discounted by the auditor.
But if the auditor uses the new evidence to reevaluate the probability of failure, an accurate prediction of the future becomes more likely.
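To make the auditor example concrete, here is a minimal sketch of that reweighting in Python. The 5% prior comes from the comment above. The likelihoods (how often a negative finding shows up when there really is a material failure versus when there is not) are made-up numbers, `posterior` and `update_sequence` are hypothetical helper names, and the findings are assumed to be independent of each other given the true state.

```python
def posterior(prior, p_evidence_if_failure, p_evidence_if_ok):
    """Bayes' rule: probability of material failure given one piece of evidence."""
    hit = p_evidence_if_failure * prior
    false_alarm = p_evidence_if_ok * (1.0 - prior)
    return hit / (hit + false_alarm)


def update_sequence(prior, findings):
    """Apply Bayes' rule once per finding, using each posterior as the next prior."""
    beliefs = []
    belief = prior
    for p_if_failure, p_if_ok in findings:
        belief = posterior(belief, p_if_failure, p_if_ok)
        beliefs.append(belief)
    return beliefs


PRIOR = 0.05  # the auditor's starting belief, from the comment above

# ASSUMPTION: each negative finding is three times as likely to appear when
# there is a material failure (0.6) as when there is not (0.2).
FINDINGS = [(0.6, 0.2)] * 3

for n, belief in enumerate(update_sequence(PRIOR, FINDINGS), start=1):
    print(f"after finding {n}: P(material failure) = {belief:.2f}")
```

Under these assumed numbers a single finding only moves the belief from 5% to roughly 14%, and it takes three such findings to pass 50%. That is the commenter’s point: with a strongly favorable prior, any one piece of negative evidence is easy to discount unless the auditor deliberately carries the updated probability forward.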
Author, Security Transformation Expert
@Sarah Clarke; I agree.
My approach doesn’t involve fuzzy what ifs, I leave that to the insurance industry. Probability is really just fiction and most Executives that are worth their weight in salary will see straight through the make believe risks.
I evaluate threats using a specific set of criteria and I evaluate vulnerabilities based on a specific set of criteria. Even before I get started on the risk assessment I have evaluated and assigned a value to threats and vulnerabilities customized to the organization and industry.
I also evaluate existing controls to avoid negatively impacting the agility of the organization taking a less is more and quality management approach to information security.
The result is a risk rating based on a 100 percent scale that’s easy to measure against the risk appetite and make decisions.
True! Like it or not, Pescatore accurately describes the risk assessment process in many, many organizations.
Jack Jones – Risk Management Executive
Late to the party – as usual. Just got approved to join the group.
John’s Rambling on Risk article and some of the comments above do a nice job of identifying common challenges associated with quantifying information security risk. Yes, many people struggle to effectively quantify the loss magnitude part of the equation. Yes, you can sink a lot of time and effort into quantification. Yes, Black Swans are always a concern. Yes, there are certainly multiple definitions for risk – at least one of which includes a positive side of the coin. And finally, yes, “wild ass guesses” can be a real problem. These are, absolutely, common challenges. That doesn’t mean, however, that they’re insurmountable — or even that difficult, at the end of the day.
I’ve been doing quantitative analyses of information risk for over ten years now, and these analyses (yes, even the loss magnitude side) have stood up under critical review. In some instances, considerable time and effort were required, but in many cases the level of effort was remarkably small — particularly if I’d done similar analyses where much of the data were reusable (which happens more often as you do more analyses).
As for the positive side of the risk coin — in infosec our charter is (at least in my experience) related to loss exposure, so I’ve chosen that as my primary focus. As soon as the business comes to me to perform an analysis on profit, then I’ll worry about the positive angle. BTW — although I’ve seen a number of standards and other references which mention the “positive side of risk” I have yet to see any of these spend more than 5% of the content (if that) actually discussing how to deal with that side of risk. Of course, maybe I’m just reading the wrong stuff. To avoid further angst though, you can substitute “loss exposure” for “risk” if it makes you more comfortable (as I sometimes do).
Wild ass guesses are a function of people not leveraging well-established techniques and methods for making the most out of weak data. Calibration techniques (read Douglas Hubbard’s book How to Measure Anything) and Monte Carlo functions allow you to perform useful, good quality analyses AND yet faithfully represent the sometimes awful nature of your data. In fact, some of the best discussions I’ve had with executives have been when I’ve done an analysis where the data were poor. In those cases, the conversation often turns to how they can improve the quality of data and thus the results, if they feel the need. At the end of the day though, estimates should focus on accuracy rather than precision, which means that estimates should rarely if ever be given as discrete values, but rather as ranges and distributions to faithfully reflect the level of (un)certainty in your data.
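As a rough illustration of “ranges and distributions” rather than discrete values, the sketch below runs a small Monte Carlo simulation for a single loss scenario in Python. It is not Jack’s method or the FAIR model. It assumes the estimator supplies a range for the annual likelihood and a 90% confidence range for the loss, that the loss is lognormally distributed, and that at most one event happens per year. The function names and every number are hypothetical.

```python
import math
import random
import statistics

Z_90 = 1.645  # z-score bounding a two-sided 90% interval of a normal distribution


def lognormal_params(low, high):
    """Turn a 90% confidence range (low, high) into lognormal mu and sigma."""
    mu = (math.log(low) + math.log(high)) / 2.0
    sigma = (math.log(high) - math.log(low)) / (2.0 * Z_90)
    return mu, sigma


def simulate_annual_loss(p_low, p_high, loss_low, loss_high, trials=20_000, seed=1):
    """Return one simulated annual loss per trial for a single scenario."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(loss_low, loss_high)
    losses = []
    for _ in range(trials):
        p_event = rng.uniform(p_low, p_high)  # uncertainty about the likelihood itself
        if rng.random() < p_event:
            losses.append(rng.lognormvariate(mu, sigma))
        else:
            losses.append(0.0)
    return losses


# ASSUMPTION: illustrative estimates, not real data.
losses = simulate_annual_loss(p_low=0.10, p_high=0.30, loss_low=50_000, loss_high=2_000_000)
cuts = statistics.quantiles(losses, n=20)  # cut points at 5%, 10%, ..., 95%

print(f"chance of any loss in a year: {sum(x > 0 for x in losses) / len(losses):.0%}")
print(f"mean annual loss: {statistics.mean(losses):,.0f}")
print(f"95th percentile annual loss: {cuts[-1]:,.0f}")
```

The wide input ranges carry straight through to a wide output, which is the honest answer when the data are poor. Narrowing the inputs is then a concrete way to discuss improving data quality.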
Black swans are always going to be with us, which is why resilience in security architectures, processes, and policies is important. That said — and this speaks to disdain for probability as well — an organization still has to prioritize its investments in loss exposure management. If this isn’t based on probability, then what on earth would it be based on? Because without probability, we should exercise the same level of concern for our sun going supernova as we would for an e-mail containing sensitive information being sent over the internet.
Having said all of that, I agree that much of what passes for risk analysis (qualitative or quantitative) in our profession is superficially thought thru and questionable. But just because that’s the common case doesn’t mean it’s the only case.
I “liked” your comment, well, because I liked it and because I’ve been guilty here of calling out problems without offering solutions. A bit to kick off the debate, but it’s not ideal. Thanks for highlighting it’s not all insurmountable.
However, I still think the main crux of the challenge is elsewhere. Scaling the kind of good quality risk assessment effort you’re talking about, aggregating risks into meaningful high level “pots” that make sense to the board. Making sure that aggregation doesn’t wipe out your ability to achieve and prove movement to a more acceptable risk position with some SMART actions. Re-inserting that kind of robustness and quality if an organically grown risk framework has become bogged down in guesstimates.
With all due respect to security and risk consultants, I too can whip off a robust risk assessment for a given threat or vulnerability. Have done frequently to put newly discovered concerns in context or to put things hanging around to bed. It’s integrating that quality into the daily churn that takes the effort.
Splicing together outputs of incident management, audit, compliance and emerging risks intelligence. Creating and tracking real KRIs for agreed actions. Working with senior stakeholders to take consideration of risk down a level from “Cyber Attacks” and “Data Loss” to something that can be meaningfully aggregated out of available data and dealt with.
Circling back into solutions mode. Who uses a standard set of common risks to benchmark against for the board? Which standard set of capabilities or risks do you hang it off, e.g. the government’s 10 steps to cyber security https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/73128/12-1120-10-steps-to-cyber-security-executive.pdf. RiskIT? Entirely business driven themes that come out of bottom up and top down risk assessments? Any more thoughts? How is that going? And any more good insights into the aggregation challenge?
Jack Jones – Risk Management Executive
Sarah, Very nicely said, and spot on. There is, as you point out, a big difference between doing quality one-off analyses and being able to ensure that individual analyses performed by multiple people/groups within an organization are consistently of decent quality.
Achieving and maintaining quality analyses from across different people/groups within an organization begins with getting everyone on the same page with regard to a definition for risk and a method for analyzing it. It’s amazing (and discouraging) that coming to agreement on the fundamental definition of what’s being analyzed is even necessary. Foundational terms in disciplines like accounting, physics, and medicine are all established, yet in the risk management profession even the term risk (let alone threat, vulnerability, incident, etc.) has multiple definitions, depending on who you talk to. Once a common lexicon is established, I’ve found that progress on analysis and measurement at least has a chance. Here again though, there’s a learning curve within an organization on how to go about measuring risk effectively on a consistent basis. The organizations I’ve seen be most successful at this provide specific risk analysis training to people across multiple disciplines (audit, infosec, compliance, etc.) and then establish a peer review process to help ensure quality is maintained. I’ve seen some larger organizations train as many as 50 people in risk analysis, although I have seen success occur when even one person in an organization is trained in risk analysis. It’s tougher sledding, but at least the conversations and debates are enriched when someone brings a more mature perspective to the table.
What I just described sounds like a lot of work. And it can be. So the natural question is why any organization would go to the trouble when it’s so easy to simply slap a “High”, “Medium”, or “Low” label on risk issues and move on. Simply stated, these organizations have come to recognize that the imprecision and lack of rigor underlying the standard ordinal risk ratings approach isn’t allowing them to prioritize issues and choose solutions as effectively as they feel the need to do. They don’t want to just “do risk management”, they want to do it well.
Aggregation, in my opinion, requires quantitative analysis. You simply can’t add up or multiply a set of red, yellow and green issues (or ordinal 1 thru 5’s) and have it make any sense. With a set of quantitative analyses you can apply a simple set of Monte Carlo and Bernoulli functions to arrive at an aggregate statement of loss exposure. It’s not as hard as it sounds.
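One way to picture the aggregation described here is the Python sketch below. It is an interpretation, not Jack’s actual tooling. Each scenario gets a Bernoulli trial per simulated year (does it happen or not) and, if it fires, a loss drawn from a lognormal fitted to a 90% range. The per-year losses are summed across scenarios. Scenario names, numbers, function names and the assumption that scenarios are independent of one another are all hypothetical.

```python
import math
import random

Z_90 = 1.645  # z-score bounding a two-sided 90% interval of a normal distribution


def draw_loss(rng, probability, loss_low, loss_high):
    """One year of one scenario: a Bernoulli trial, then a lognormal loss if it fires."""
    if rng.random() >= probability:
        return 0.0
    mu = (math.log(loss_low) + math.log(loss_high)) / 2.0
    sigma = (math.log(loss_high) - math.log(loss_low)) / (2.0 * Z_90)
    return rng.lognormvariate(mu, sigma)


def simulate_total(scenarios, trials=20_000, seed=7):
    """Return the aggregate loss across all scenarios for each simulated year."""
    rng = random.Random(seed)
    return [
        sum(draw_loss(rng, p, low, high) for p, low, high in scenarios.values())
        for _ in range(trials)
    ]


def chance_of_exceeding(totals, threshold):
    """Fraction of simulated years in which the aggregate loss exceeds threshold."""
    return sum(t > threshold for t in totals) / len(totals)


# ASSUMPTION: name -> (annual probability, low loss, high loss); invented figures.
SCENARIOS = {
    "scenario A": (0.10, 100_000, 5_000_000),
    "scenario B": (0.30, 10_000, 250_000),
    "scenario C": (0.05, 50_000, 1_000_000),
}

totals = simulate_total(SCENARIOS)
print(f"chance of any loss this year: {chance_of_exceeding(totals, 0):.0%}")
print(f"chance aggregate loss exceeds 1,000,000: {chance_of_exceeding(totals, 1_000_000):.1%}")
```

The output is a statement such as “an X% chance of losing more than Y in a year”, which can be compared across business areas in a way that a pile of red, yellow and green ratings cannot.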
As for “common risks”, I’ve had success with three different risk categorization taxonomies. Which one I use depends on the organization and the scope of what we’re trying to tackle.
The first one is infosec focused and simply carves the risk landscape into five high-level types of events: Compromise of sensitive customer information, compromise of sensitive corporate information, online fraud, denial of service, and regulatory non-compliance. These of course can be tweaked to the needs of the organization. These risks lend themselves well to frequency and magnitude analysis, and can be applied at any level of granularity an organization chooses to use (individual business processes, technology layers, etc.).
The second taxonomy includes the larger IT risk landscape, and includes 29 risks. Since I’ve already made this response longer than probably anybody cares to read, I’ll hold off on listing those risks unless someone asks.
Hope this helps.
Senior IT Auditor
Sure I’ll ask. I could always use someone else’s perspective on what makes certain taxonomies better suited than another for various IT functions. The one I’m most familiar with is from SEI, the taxonomy of operational risk. Thanks.
Great answer Jack and thanks. Was wary of taking the discussion off piste and was just about to ping John to ask when his second part of the blog is out. Having said that, this discussion is the more valuable one in my opinion.
Although your post is long, it’s packed with good info. I’m also in a glass house throwing stones if I criticize, aren’t I! I do Twitter as therapy.
The taxonomy info is great. My natural InfoSec home is ISO2700x, but for this, there is absolutely no substitute for good on the ground experience of making these things work. One side step to add to the “who does risk” discussion is that you can train many in risk analysis and organisations I’ve worked in have invested in that, but you have to be mindful of specific IS challenges when doing it. It’s almost a subdivision of the discipline.
The concerning theme I’ve seen in earlier patches of my career is IS risk being seen as security’s responsibility to own and manage. Yes. I know. “Security is everyone’s responsibility”, but culture change isn’t a snap your fingers job. It can get fixed though. Getting stakeholders to grasp that for IS, the people owning the assets and processes at risk, the people operating the IT/IS controls, the people risk assessing both and the people who ultimately own those risks on behalf of the business are different bodies. None of them having all the info by themselves to assess a risk.
A rock solid risk RACI is therefore always high on my to do list whenever I kick some kind of risk analysis effort off.
Thanks again and looking forward to the next installment.
Jack Jones – Risk Management Executive
I think you’ll find the taxonomy I’m talking about to be similar in some respects to the SEI taxonomy, although at a higher level of abstraction. Mine is also more explicitly focused on IT and on events that you can assign a frequency (or likelihood) and impact statement to. For example, one of the SEI taxonomy elements is “Timeliness”, while one of mine is “IT Project Late Delivery”.
Clearly, this list would need some adjustment based on the industry of an organization. I’ve found it’s a decent starting point though, and at least helps keep the focus on risk versus controls.
- Accidental disclosure of sensitive consumer information
- Accidental disclosure of sensitive corporate information
- Business project budget overrun
- Business project late delivery
- Business project quality failure
- Data center outage
- Data loss/destruction
- HIPAA audit failure
- Internal audit failure
- IT organization budget overrun
- IT project budget overrun
- IT project late delivery
- IT project quality failure
- Loss of key personnel
- Malicious breach of sensitive consumer information
- Malicious breach of sensitive corporate information
- Material financial misstatement due to IT
- OCC/Fed audit failure
- PCI audit failure
- Product/service degradation
- Product/service outage
- Product/service quality/integrity problem
- Regulatory compliance failure
- SAS70 audit failure
- SOX audit failure
- Vendor deliverable quality failure
- Vendor failure
So, if you made it this far and have strong feelings about how risk is currently handled, why not add to the discussion?
Also look out for something by Sarah for the Tripwire State of Security blog coming soon. Thinking has moved on in the 8 months since this discussion started. No universally accepted solutions have emerged, but it will be delving into different challenges and pragmatic ways forward for those struggling with high level and operational security risk management.