CAPEC User Summit Transcript - "CAPEC Entry Completeness and Quality"
Steve Christey Coley, CAPEC/CWE Program
Session 4 - CAPEC Entry Completeness and Quality

(00:00) Speaker: Rich Piazza (Summit Host)
We're going to have a discussion led by Steve Christey Coley, who is the CWE/CAPEC Tech Lead. Steve, take it away.

(00:05) Speaker: Steve Christey Coley
Great, thank you, Rich. Hello, everybody, I'm glad to see all of this interest in CAPEC here. Let's go to the next slide. So, a brief summary of what I'll be talking about over the next few minutes. Over the past few years, we've had a high-level goal of modernizing CAPEC as well as CWE. What we're trying to translate that into is understanding and aligning CAPEC content with what the community needs. I'll touch a little on some of the recent activities we've been going through to improve CAPEC content and to get more direct community input, so that we're sure we are appropriately on track for meeting the different communities' needs. As part of that, there will be a live poll in about 15 or 20 minutes, asking about some things that are important to you. One other thing to note: a lot of what I'm talking about today applies to CWE too - we're using a similar approach there to help us focus all of our content development work in ways that serve what our communities need. Next slide.

(01:17) Speaker: Steve Christey Coley
So, what are some of the high-level goals for us in terms of modernizing CAPEC? One of them, as I mentioned, is that we want to really work with the communities. We need to identify who our stakeholders are and what their needs are, and we've been doing that in a couple of different ways, primarily through user communities. That includes the CWE/CAPEC Board, the User Experience Working Group, all of you who are participating in the summit today and filling out polls, and other user communities. Beyond that, a lot of our content development work can be broken down into three areas. The first is repository coverage: what should CAPEC be covering in the first place? What kinds of attack patterns should it be covering? The second is a notion I'm calling "entry completeness" within an individual CAPEC entry: which fields - which data elements - are the most important for that entry, so that we can be sure we are providing information about them? The third is a somewhat fuzzier notion related to the quality of the entries we produce. Our goal there is: how do we ensure that all of the fields or data elements are sufficiently up-to-date, that they've been independently validated, and that they're sufficiently accurate and precise? All of these are good questions to ask. If we could be perfect right out of the gate, we would definitely want to be. However, the real world intervenes, so we have to ask these kinds of questions. Next slide, please.

So, to touch on repository coverage. A lot of what we wrestle with boils down to two questions. The first: does CAPEC really cover all of the different kinds of attack patterns that the community needs?
That could include new and emerging techniques or attack patterns, as well as older ones that we didn't get around to, or that we weren't aware of in enough detail to turn into a CAPEC entry. The second question, which we've been wrestling with a lot more over the past year or two, is: is CAPEC's scope broad enough? There can be attack patterns in lots of different areas of technology, lots of different domains. Are there particular sub-communities that are looking for something different from CAPEC than what it provides currently? We've already talked a little bit about the hardware domain and the transitions we're trying to make with respect to CAPEC as well as CWE, and a little bit later we'll be talking about supply chain as well. Those are two domains. Hardware is pretty new, but supply chain we've supported within CAPEC for a number of years - and we have some renewed energy in figuring out how we support it. And then for use cases, pen testing is one of the areas that we've been looking at more, as was discussed earlier. With respect to all of this, we recognize that CAPEC will be better if we have active input and support from the real experts who are out there. There's a huge world of knowledge, and no individual is an expert in everything, so the work we produce really benefits from more direct and active engagement with all of the different experts and specialists out there. That's part of what we're trying to do as we seek to improve coverage across the entire repository: to have more direct community engagement, to find those experts, and to help them help us make CAPEC better for everybody. Next slide, please.

(05:24) Speaker: Steve Christey Coley
So when it comes down to quality - this kind of fuzzy thing - we've had a number of different efforts and priorities that we've been working on more recently. We've been working on making descriptions shorter; we created a new element called Extended Description where important details can be kept, which we think helps with usability. We've had some special efforts in mappings to related weaknesses - to lower-level weaknesses - related to access control. We've been trying to get consistency within CAPEC, as well as between CAPEC and CWE, in understanding and capturing the kinds of common consequences that can occur from exploitation of a weakness in the form of an attack pattern. Another area we've started working on is ensuring that we provide good quality mitigation information. That's consistent not only between CWE and CAPEC; we've also been collaborating with members of the D3FEND project, which is similar to ATT&CK but is really about capturing all of the different defensive techniques that an organization can take in order to protect itself from various adversaries - and they deal with a lot of areas of mitigations as well. We're doing that kind of collaboration for mitigations, and, as was discussed earlier, we've been making a number of improvements to existing execution flows within CAPEC entries, so that overall we can really improve the quality of the entries we have out there, as well as of the entries we produce going forward. Next slide.
(07:10) Speaker: Steve Christey Coley
So, entry completeness is really about understanding what fields are the most important to users and making sure that we have those fields filled in, providing useful information. In the past few years, we've been trying to make sure that new CAPEC entries, when they come out, are a lot more complete: they have a lot more fields filled in, and they're generally higher quality. But we've also been looking back at older entries and trying to do things such as making sure that most Detailed-level CAPEC entries have execution flows, improving the overall completeness of entries related to a particular area such as supply chain, and beefing up mappings to external efforts such as ATT&CK. Next slide.

(08:00) Speaker: Steve Christey Coley
Well, how would we determine what is important, or what seems to be important? We've conducted a bit of an analysis, especially in the last year. In the past, the CAPEC team - based on discussions with users and so on, but not in any formal sense - tried to estimate what the community generally thought was most important, and we laid out certain criteria for that. But last year, we had a specific effort with the User Experience Working Group, asking them to weigh in and say which fields and which values are important to them, and then we did a comparison. We saw a lot of the usual suspects: there was agreement between the CAPEC team and the User Experience Working Group in areas such as the description, prerequisites for attack, related weaknesses, consequences, and mitigations. But there were other cases where something the CAPEC team thought was important wasn't particularly important to them - one of the most interesting being the notion of execution flow - and other areas with slight disagreement. It should be noted that this was an effort with the User Experience Working Group, which is a pretty small sample size. So while it was very informative and very helpful to us, we do want to get more input from the broader community, and that's where we're headed with the live poll in a moment. Next slide, please.

(09:46) Speaker: Steve Christey Coley
So we've been trying to develop metrics for how we measure CAPEC entry completeness. We pre-chose a set of 17 fields that seem to be the highest priority, and then analyzed individual entries to figure out how complete they are with respect to those high-priority fields. We've done this as a way to develop and test the methodology, and to make sure that it makes sense. So for example, if you have an entry that has 16 out of the 17 high-priority fields, we can say "that entry is 94% complete." But one thing we've realized is that some entries can't necessarily have all of the high-priority fields. For example, there might be a field determined to be high priority that relates to examples; but if there aren't any known or publicly documented real-world examples, we can't fill that out. And as I mentioned previously for things like execution flows, it doesn't necessarily make sense to have execution flows for entries at a really high level of abstraction, such as the Meta level.
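(Aside: to make the arithmetic above concrete, a completeness score of this kind might be computed over only the fields that apply to a given entry. The sketch below is purely illustrative - the field list and the applicability rule are hypothetical placeholders, not the CAPEC team's actual methodology or tooling.)

```python
# Illustrative sketch only: the field names and the applicability rule are
# hypothetical placeholders, not the CAPEC team's actual methodology.

HIGH_PRIORITY_FIELDS = [
    "Description", "Prerequisites", "Related_Weaknesses", "Consequences",
    "Mitigations", "Execution_Flow", "Example_Instances",
    # ...the remaining high-priority fields would be listed here
]

def applicable_fields(entry: dict) -> list[str]:
    """Drop fields that cannot apply to this entry; for example, an execution
    flow is not expected for a Meta-level (highly abstract) entry."""
    fields = list(HIGH_PRIORITY_FIELDS)
    if entry.get("Abstraction") == "Meta":
        fields.remove("Execution_Flow")
    return fields

def completeness(entry: dict) -> float:
    """Percent of applicable high-priority fields that are populated."""
    fields = applicable_fields(entry)
    filled = sum(1 for f in fields if entry.get(f))
    return 100.0 * filled / len(fields)

# An entry with 16 of 17 applicable fields populated scores 16/17, about 94%.
```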
But working on this methodology has helped us out, and now we're starting to really put it to work. Next slide.

(11:07) Speaker: Steve Christey Coley
So here's a chart of all of the different fields - there are 17 high-priority fields, and somebody can double-check the count there - showing how complete we are across the entire repository, relative to those particular fields. Right in the middle, you can see execution flow as a pretty small bar: it's covered in 227 CAPEC entries. As I just noted, it's not expected that every single attack pattern entry will have an execution flow in the first place, but nonetheless you can visualize some of the areas where we could potentially make improvements across the entire repository in certain key fields. Again, all of this relates to the methodology we've been trying to develop, and now we really need to ask the question: what are the highest-priority fields - not just from the CAPEC team's side of things, but really, what does the community want? And that's where we're going right now. Next slide.

(12:23) Speaker: Steve Christey Coley
So we're going to do a live participant poll with you. This one is a little more comprehensive than some of the previous polls, but we're going to ask you now to start participating. Next slide.

(12:39) Speaker: Steve Christey Coley
Basically, we're going to ask for your opinions on all of these different fields: how important is each of these individual fields to you, to your work, and to how you use CAPEC? And we'll be doing that online.

(12:58) Speaker: Steve Christey Coley
We're allowing around 10 minutes or so for people to think on this, but you can visit the poll site now. What we want you to do, to help us out, is to look at each individual CAPEC field and then score it based on your prioritization. If you don't have a particular opinion about a field, you can give it a zero. Otherwise, you can use this range of values: 1 meaning irrelevant, 2 occasionally useful, 3 nice to have, 4 really important to you, and 5 pretty much essential. If you need to look up what some of these fields are, we sent out an email to you yesterday with the subject line "prioritization of fields in CAPEC." So we're going to pass this on to everybody to go through and give us your assessments, and then within a matter of a few minutes, once Alec has seen that we're starting to get some results, we'll start sharing some of those results and have a little bit of discussion.

(14:10) Speaker: Alec J. Summers - CAPEC/CWE Program
Thank you, Steve. I went ahead and put the link directly to the poll in the chat, so if you want, you can click right there; that will take you right to the poll itself. If for some reason you don't see the actual question, you might just need to click in the footer of your browser where it says "Polls." That should take you right to it, and you just rank these fields from zero to 5 according to the scale on Steve's shared screen.
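(Aside for readers who want to reproduce a per-field coverage chart like the one Steve described a moment ago: the sketch below counts populated fields in the public CAPEC XML download. The namespace URI and element names reflect the published CAPEC v3 schema as commonly distributed, but treat them as assumptions to verify against the version you download.)

```python
# Minimal sketch: count how many attack pattern entries populate each field of
# interest in the public CAPEC XML download (capec_latest.xml, available from
# capec.mitre.org). Verify the namespace URI and element names against the
# schema version of the file you download.
import xml.etree.ElementTree as ET

NS = {"capec": "http://capec.mitre.org/capec-3"}
FIELDS = ["Description", "Prerequisites", "Execution_Flow", "Consequences",
          "Mitigations", "Example_Instances", "Related_Weaknesses"]

tree = ET.parse("capec_latest.xml")
patterns = tree.getroot().findall(".//capec:Attack_Pattern", NS)

counts = {f: 0 for f in FIELDS}
for ap in patterns:
    for f in FIELDS:
        if ap.find(f"capec:{f}", NS) is not None:
            counts[f] += 1

for f in sorted(FIELDS, key=lambda f: -counts[f]):
    print(f"{f}: populated in {counts[f]} of {len(patterns)} entries")
```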
(14:37) Speaker: Steve Christey Coley
And while people are filling this out, Rich - this is maybe putting you on the spot, but you've been leading a lot of the content production efforts over the past few years. Would you like to discuss a little of what your approach has been, and what your thinking has been behind some of the prioritization that we've done so far?

(14:56) Speaker: Rich Piazza
Yeah, sure, I can say a few things off the cuff. As we've said, we're trying to modernize CWE and CAPEC. We're really trying to look at these corpora from the ground up and see what these fields hold, how useful they are, whether they're giving users the right information, whether different user personas need different information, and so forth. That's one of the things we're really trying to understand - and I don't know, Steve, if you're going to present this or already have - but there are certain things we're looking at to see if we can make them better. One of the ones we discussed in the first session was execution flows. Execution flows, we feel, are very important, especially for the pen testing use case. An execution flow explains what the attack is and how to carry it out, and gives detailed information about what's going on, beyond the few paragraphs in the description field. But there are issues with execution flows, and some of them have to do with the fact that CAPEC is a hierarchical corpus, so we have different levels of detail: we go from the category, which is the highest level, to the Meta level, the Standard level, and lastly the Detailed level. So the question is: is it appropriate for every CAPEC at every level to have an execution flow? The answer, which we discussed, was probably not. It probably makes sense for the Detailed ones to have execution flows, and maybe some of the others, depending on what the hierarchy looks like for that particular part of the subtree. The other thing is that there may be very little to add in a child's execution flow that you couldn't find in the parent's, and the question is whether there is some way we should represent that, instead of copying the exact same execution flow word-for-word at each level of the tree. Maybe it makes sense to have some other way to present that information, so you can get a feel for the one CAPEC and its parents together and get a more complete picture. We're also doing some work on consequences - the scoping of consequences. There are the usual CIA scopes that you probably all know about, and underneath those there are particular impacts, and we'd like to make that clearer. I think everybody working in this field knows that CIA is a great idea, but it's sometimes hard to draw the lines between what's confidentiality and what's integrity, that sort of thing. Those are some of the things we're working on. Steve, is there anything else you think I should mention?

(19:04) Speaker: Steve Christey Coley
Yeah, I think this effort has been helping us to focus a little more on what seems to be important. We are really looking for community input on this, so that we can make more or less final decisions and then be sure that we're optimizing, as much as possible, all of our content production going forward.
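(Aside: Rich's point about not copying a parent's execution flow word-for-word at each level suggests an inheritance-style lookup. The sketch below shows one hypothetical way to model that idea - it is not an existing CAPEC mechanism - where an entry without its own flow resolves to the nearest ancestor that defines one.)

```python
# Hypothetical sketch of the idea Rich raises: rather than duplicating a
# parent's execution flow verbatim in each child entry, a child without its
# own flow could resolve to the nearest ancestor that defines one. This is an
# illustration of the concept, not an existing CAPEC mechanism.
from dataclasses import dataclass
from typing import Optional

@dataclass
class AttackPattern:
    capec_id: int
    abstraction: str                          # "Meta", "Standard", or "Detailed"
    execution_flow: Optional[list[str]] = None
    parent: Optional["AttackPattern"] = None

    def effective_execution_flow(self) -> Optional[list[str]]:
        """Walk up the hierarchy until an entry with its own flow is found."""
        node = self
        while node is not None:
            if node.execution_flow:
                return node.execution_flow
            node = node.parent
        return None

# A Detailed-level child with no flow of its own inherits its parent's.
parent = AttackPattern(1000, "Standard",
                       execution_flow=["Explore: survey the target",
                                       "Experiment: probe candidate inputs",
                                       "Exploit: deliver the attack"])
child = AttackPattern(1001, "Detailed", parent=parent)
assert child.effective_execution_flow() == parent.execution_flow
```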
I'd like to briefly address a comment in the chat from Jim, who says it would be interesting to know what user personas the User Experience Working Group has in mind, because different personas have different knowledge and interests. That's definitely the case. We did some work on user personas last year, but I personally believe it still needs more work. At the very least, we want to be sure that we're supporting the most important personas, and that kind of analysis depends on the UEWG making further progress on defining and understanding personas than it has made at this point. So we might wind up making some additional shifts at a later point in time. But we're trying to make this shift to more community involvement as much as possible, while also continuing to put out content. Alec, let me follow up with you here. Are we getting any results? It wasn't clear how long it might take people to work on this.

(20:42) Speaker: Alec J. Summers
We are indeed. Yes, you're absolutely right, there will be some variance in terms of time. However, I can go ahead and share my screen here. I have this cool little radar graph - spider graph, or whatever we want to call it.

(20:54) Speaker: Steve Christey Coley
Great. One thing to note for everybody - as Alec has put in the chat, this poll is still open for input. We also recognize that some people might be agonizing over some of their decisions, and we will be reopening it a little later to give people more time as well.

(21:12) Speaker: Alec J. Summers
I appreciate that input. Thanks, Steve. So here we can see - hopefully you can see my screen - this is a radar graph. Based on the data we're showing, we've got an average rank and a median rank. One thing that immediately stands out to me, of course, is the fields that have received essentially a very high rank: Related Weaknesses and Mitigations. We've heard a lot, especially recently, about Mitigations and the value of not just enumerating the attack patterns themselves, but identifying the appropriate opportunities for mitigating these attacks at different stages. I see we've got 30 people online contributing; I don't know if that includes all the people that have successfully submitted yet. One reminder: if you're in the poll, I believe you do need to submit your answers at the very end. There is a purple button at the bottom that says "submit answer," and I think that does require completion before your data is accepted.

(22:23) Speaker: Steve Christey Coley
Thank you, Alec. It seems to indicate that about 28 people or so have replied to the poll already. Are there other visualizations we could look at besides the spider graph?

(22:41) Speaker: Alec J. Summers
Yes - actually, this is one where we can see just the counts. I don't know if there's another visual as far as the outcome; I think this is something we can certainly share after the fact, once we get all the data in. I'm happy to leave it on either one - if you want to speak to this one, Steve, that's fine.

(23:02) Speaker: Steve Christey Coley
Yeah, I'll speak to this a little bit. Let's go down another page. What you can see here is the variety of opinions that are out there. That said, there seems to be pretty heavy agreement about the importance of covering consequences. For execution flows, there's not necessarily a huge amount of interest.
But there is some interest there, it seems. For mitigations, this is a very clear result: people believe that it's very, very important - or pretty much essential - for us to be covering mitigations. So that's a good indication. For Example Instances, these results are fairly consistent with what came from the UEWG before, but one thing that's interesting to me is that I would think examples would be very useful for people - at least those who are trying to learn new attack patterns and things like that. Maybe that's just one particular kind of task, or one particular user persona, that might benefit from the examples, so it's notable that we're not seeing a huge number of people being totally gung ho for it. And it's good to know that there's a lot of support for Related Weaknesses because, as Rich said, CAPEC is hierarchically organized, and by necessity we need to keep it that way. That, I think, is good enough for now, so I'll take a quick look at the chat and respond to some of the comments.

One comment from Massimiliano - apologies if I made a mistake in your name - talks about execution flows: they're really necessary for detailed attack patterns and sometimes for standard attack patterns. And there was a comment that even if expert pen testers don't rely on the execution flow so much, it could be really useful to less experienced people. There's a question in the chat about how one gets involved in the User Experience Working Group; we'll make sure to get that info into the chat for you. Another comment notes that the question of completeness and quality has two contexts - the CAPEC website as well as the downloadable XML - since there are things in the website content that are not easily accessible within the XML, such as domains, abstraction level, and so on; and asks whether we will address updates to the XML content. This is part of why we rely on you, the community, to give us feedback on what you need, and the User Experience Working Group is one attempt at that. We anticipate forming other working groups in the relatively near future, where the topic of what kind of data needs to be accessed, and how easy it is to access that data, will be a pretty big part of the discussions. So we will be having more discussions with everybody in that area in the future. Another comment in the chat says that one factor to keep in mind is that our votes probably depend on our experiences using CAPEC, which in turn depend on the information available in the CAPEC entries that are used - and I think that's a very fair point as well. We're trying to make improvements where we can with the data that we have and the data that we can get, so we recognize that.

(27:32) Speaker: Steve Christey Coley
For our next steps in CAPEC, we are looking to continually improve CAPEC's completeness. Thank you, everybody, for your polling results, whether now or later today. Once we've determined a final set of high-priority fields, we'll document them, make sure we publish them, and continue to seek active community contributions. We really rely on you to provide your expertise. But as we open things up more to external contributions, we also face the question of becoming not so much the producers of information as the stewards of it, and of ensuring that we're still maintaining quality.
There's a lot that we'll need to discuss and figure out moving forward. But that's basically where we're at.