Dr. David A. Cook is Associate Professor of Computer Science at Stephen F. Austin State University, where he teaches Software Engineering, Modeling and Simulation, and Enterprise Security. Prior to this, he was Senior Research Scientist and Principal Member of the Technical Staff at AEgis Technologies, working as a Verification, Validation, and Accreditation agent supporting the Airborne Laser. Dr. Cook has over 40 years’ experience in software development and management. He was an associate professor and department research director at the USAF Academy and former deputy department head of the Software Professional Development Program at AFIT. He has been a consultant for the Software Technology Support Center, Hill AFB, UT, for 19 years.
Luke Muehlhauser: In various articles and talks (e.g. Cook 2006), you’ve discussed the software verification, validation, and accreditation (VV&A) process. Though the general process is used widely, the VV&A term is often used when discussing projects governed by DoD 5000.61. Can you explain to whom DoD 5000.61 applies, and how it is used in practice?
David A. Cook: DOD 5000.61 applies to all Department of Defense activities involving modeling and simulation. For all practical purposes, it applies to all models and simulations that are used by the DOD. This implies that it also applies to all models and simulations created by civilian contractors that are used for DOD purposes.
The purpose of the directive, aside from specifying who the “accreditation authority” is (more on this later), is to require Verification and Validation for all models and simulations, and then also to require that each model and simulation be accredited for its intended use. This is the critical part, as verification and validation have almost universally been a part of software development within the DOD. Verification asks the question “Are we building the system in a quality manner?”, or “Are we building the system right?” Verification of a model (and the resulting execution of the model providing a simulation) goes a bit further – it asks the question “Do the model, as built, and the results of the simulation actually represent the conceptual design and specifications of the system?” The difference is that in a model and simulation, you have to show that your design and specifications of the system you envision are correctly translated into code, and that the data provided to the code also matches specification.
Validation asks the question “Are we building a system that meets the users’ actual needs?”, or “Are we building the right system?” Again, the validation of a model and resulting simulation is a bit more complex than non-M&S “validation”. In modeling and simulation, validation has to show that the model and the simulation both accurately represent the “real world” from the perspective of the intended use.
These two activities are extremely difficult when you are building models and providing simulation results for notional systems that might not actually exist in the real world. For example, it would be difficult to provide V&V for a manned Mars mission because, in the real world, there is no manned Mars lander yet! Therefore, for notional systems, V&V might require estimation and guesswork. However, guesswork and estimation might be the best you can do!
5000.61 further requires that there be an ultimate authority, the “accreditation authority”, that is willing to say “based on the Verification and Validation performed on this model, I certify that it provides answers that are acceptable for its intended use”. Again, if you are building a notional system, this requires experts to say “These are guesses, but they are the best guesses available, and the system is as close a model to the real world as possible. We accredit this system to provide simulation results that are acceptable.” If, for example, an accredited simulation shows that a new proposed airplane would be able to carry 100,000 pounds of payload – but the resulting airplane, once built, can only carry 5,000 pounds – the accreditation authority would certainly bear some of the blame for the problem.
In practice, there are processes for providing VV&A. Military Standard 3022 provides a standard template for recording VV&A activities, and many DOD agencies have their own VV&A repository where common model and simulation VV&A artifacts (and associated documentation) are kept.
There are literally hundreds of ways to verify and validate a model (and its associated simulation execution). The V&V “agents” (who have been tasked with performing V&V) provide a recommendation to the Accreditation Authority, listing the acceptable uses and (the critical part) the limits of the model and simulation. For example, a model and simulation might provide an accurate representation of the propagation of a laser beam (in the upper atmosphere) during daylight hours, but not be a valid simulation at night, due to temperature-related atmospheric propagation effects. The same model and simulation might be a valid predictor of a laser bouncing off of a “flat surface”, but not off of uneven terrain.
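The “intended uses and limits” idea can be sketched as a small guard around a simulation: record the accredited envelope explicitly, and refuse to report results outside it. This is a toy illustration – the condition names, data fields, and functions are all invented, not drawn from any real VV&A tooling.

```python
# Toy sketch: an accredited-use envelope as explicit, named predicates.
# Condition names and scenario fields are hypothetical, for illustration only.
ACCREDITED_LIMITS = {
    "daylight_hours": lambda s: 6 <= s["local_hour"] <= 18,  # valid in daylight only
    "flat_terrain":   lambda s: s["terrain"] == "flat",      # valid for flat surfaces only
}

def check_envelope(scenario):
    """Return the names of accredited-use limits the scenario violates."""
    return [name for name, ok in ACCREDITED_LIMITS.items() if not ok(scenario)]

def run_accredited(scenario, simulate):
    """Run the simulation only if the scenario is inside the accredited envelope."""
    violated = check_envelope(scenario)
    if violated:
        raise ValueError(f"outside accredited envelope: {violated}")
    return simulate(scenario)
```

The point is not the code but the discipline: the limits travel with the model, so a night-time or rough-terrain run fails loudly instead of silently returning an unaccredited answer.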
Luke: Roughly how many accreditation authorities are there for such projects? Do accreditation authorities tend to specialize in accrediting V&V in certain domains — e.g. some for computer software, some for airplanes, etc.? Are there accreditation authorities that the DoD doesn’t recognize as “legitimate”?
David: Accreditation authorities are simply the component heads who sign a letter saying “Model and Simulation X is approved for the following purposes”. The letter then states what the intended uses are, lists any special conditions, and lists the limitations of the model and the simulation. The accreditation authority is more of a position than a person. It can be a person (usually the head of the organization), or a committee.
Each DOD agency is responsible for models and simulations that it develops or uses – it must either VV&A its own models and simulations, or use models and simulations (from a trusted source) that have undergone their own VV&A. Note, however, that each DOD agency probably has its own data – which must also be accredited. Every project has its own M&S, and probably has domain experts who perform the VV&A. Rather than go to the top for accreditation, each project has probably been delegated the authority to perform its own VV&A.
There are no non-legitimate accreditation authorities per se; accreditation authorities are not authorized based on knowledge, simply on position. However, it is assumed that each M&S area has domain experts who have the specialized knowledge in the application area to perform reliable VV&A. These domain experts span many areas – application domain experts (who might, for example, be experts on a laser beam), coding domain experts (who can verify that the code is a good representation of the requirements), data domain experts (who verify that the targeting data represents a valid target), and perhaps many others. Typically, each project has a VV&A team or “agent” who performs the V&V and recommends accreditation (usually in a formal letter) that restates the intended uses and limitations. The recommendation includes all associated artifacts, such as test results, reviews, reports of individual Verification and Validation activities, other models and simulations used to compare against, real-life data (to show validity), and possibly many other items.
If a particular DOD agency is using a model and simulation in its exercises, it is responsible for VV&A of its own M&S. If, on the other hand, an allied agency is using a model that includes artifacts from another agency, the outside agency is responsible for making sure that the model, simulation, and data representing them are valid. In essence, each DOD agency is responsible to every other DOD component for ensuring that its forces and capabilities are appropriately represented to all outside agencies utilizing models and simulations that involve them.
Luke: Are there historical cases in which a model or simulation completed the VV&A process, but then failed to perform as it should have given the accreditation, and then the accreditation authority was held to account for their failure? If so, could you give an example? (Preferably but not necessarily in software.)
David: Because I worked as a consultant on many modeling and simulation projects, I am ethically prevented from discussing actual failures that I know about – mainly because most of the projects I worked on were classified, and I signed non-disclosure agreements.
However, by shifting into hypothetical scenarios, there are several stories I can use to illustrate this. One is a story taught in many simulation classes – and I only have secondhand knowledge of it. The other two are anecdotal – but good lessons!
In the first instance, a model was used which predicted “safe runway distance” for an airplane. Feed the model the weight, altitude, temperature, and humidity, then run the simulation to predict how much runway was needed.
Unfortunately, the day the model was used, it took several hours for the airplane to actually take off. It had a bit more fuel than estimated – adding weight. By takeoff time, the temperature had risen, giving “stickier” tires and runway, and decreasing air density (giving less lift). The humidity had also changed, further affecting lift characteristics.
The model did not have a large allowance for error (it tried to give a relatively precise answer) – and with all the factors changing, the airplane went from “enough runway” to “marginal” after the simulation had been run. Combined with a relatively inexperienced pilot (who did not advance the throttle fast enough), the airplane overshot the end of the runway. Not much damage (other than a bruised ego) – but the simulation, while accurate, was not used properly.
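The runway story comes down to a point prediction being used without a tolerance. A toy sketch makes the lesson concrete – every coefficient and the 15% margin are invented for illustration, and nothing here resembles a real aircraft performance model:

```python
def required_runway_ft(weight_lb, temp_c, altitude_ft, humidity_pct):
    """Toy takeoff-roll estimate; every coefficient is invented for illustration."""
    base = 3000.0
    base *= 1.0 + 0.4 * (weight_lb - 100_000) / 100_000   # heavier -> longer roll
    base *= 1.0 + 0.01 * max(temp_c - 15.0, 0.0)          # hotter -> less lift
    base *= 1.0 + 0.03 * altitude_ft / 1000.0             # thinner air -> less lift
    base *= 1.0 + 0.001 * humidity_pct                    # humid air -> less lift
    return base

def go_no_go(runway_ft, margin=1.15, **conditions):
    """Require a margin over the point prediction, so 'marginal' reads as no-go."""
    return required_runway_ft(**conditions) * margin <= runway_ft
```

Run it with the morning’s estimates and again with the actual takeoff-time conditions: a result that passes at planning time can fail hours later – which is exactly what the margin is there to catch.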
The other two stories are certainly imaginary – but are passed around like legend in our field. In the first story, in the early days of the Airborne Laser, a very simple model was used to predict laser propagation. Code was reused to model the laser – basically, code for a missile, with the speed of the missile increased to the speed of light. The targeting acquisition, target recognition, etc. were all similar, and once fired, the simulation would show if the target was hit. The first time they ran it, halfway to the target, the laser beam “ran out of fuel” and fell into the sea.
The second (certainly imaginary) story involves modeling a battle scenario for the Australian Air Force – using helicopters. One of the problems with landing a helicopter was making sure it had a clear landing field – and kangaroos were a problem. So the developers of the battle simulation, who used object-oriented development, took some code which was basically used to model a ground soldier and modified its behavior to “run at the sound of helicopters”. They then changed its appearance in the simulation to show a small image of a kangaroo. When the model was executed, the simulation showed the kangaroos running away from the helicopter – until it landed, at which point the kangaroos reversed direction and attacked the helicopter with rifles!
OK – the last two examples are cute and funny, but they show the problems with invalid assumptions and imperfect data.
I leave this question with a quote that I always use in my M&S classes – “All models are inaccurate. Some models are useful anyway.” It is extremely difficult to model the real world totally inside of code. I don’t care how well you model a hot tub in a simulation – it is NOT really a hot tub.
There are always things you do not consider, data that is not perfect, or constraints that you miss. A model is a best-guess approximation of what will happen in the “real world” – but it is NOT the “real world”. All models have limitations. In spite of that, the resulting model and the simulation still give useful data. The accreditation authority is simply acknowledging that the model and simulation are useful “for their intended use”, and that “limitations exist”. No reputable modeling and simulation expert (nor any accreditation authority) trusts a single model and its resulting simulation to produce data that is used in life-or-death decisions. Multiple sources of validity are required, multiple independently-developed models and simulations are used, and domain experts are consulted to see if the results “feel right”. And tolerances must always be given. An aircraft might encounter a puddle of water when trying to take off. It might hit a bird. Both of these decrease speed, requiring a longer takeoff distance. It’s hard to model unforeseen circumstances. If you include a “fudge factor” – how much “fudge factor” is correct? Before an accreditation authority accepts a model or simulation as reliable, many, many steps must be taken to make sure that it produces credible results, and equally many steps must be taken to make sure that the limitations of the model and simulation are listed and observed before accepting the result of the simulation as valid.
Luke: How did the VV&A process develop at DoD? When did it develop? Presumably it developed in one or more domains first, and then spread to become a more universal expectation?
David: Interesting question. Before we can discuss VV&A, we have to take a slight detour through the history of M&S. And I need to tie several threads of thought together.
VV&A is, of course, tied to the use of models and simulations. To be honest, the VV&A of models goes back to the Civil War (and probably earlier) – when mathematical models were used to predict firing data (given a desired range, the tables gave the amount of powder and elevation required). Obviously, the models needed a lot of V&V. However, all it took to V&V the model was to load a cannon and fire it. Not a complex process. The accreditation part was implicit – the Secretary of War used to “authorize” the data to be printed. To really need VV&A, however, complex simulations were needed – and it took computing power to achieve complex M&S.
Over the years, modeling became more and more important, as models and simulations were used for problems that could not easily be solved by traditional mathematical methods. To quote from the Wikipedia article on computer simulation:
Computer simulation developed hand-in-hand with the rapid growth of the computer, following its first large-scale deployment during the Manhattan Project in World War II to model the process of nuclear detonation. It was a simulation of 12 hard spheres using a Monte Carlo algorithm. Computer simulation is often used as an adjunct to, or substitute for, modeling systems for which simple closed form analytic solutions are not possible. There are many types of computer simulations; their common feature is the attempt to generate a sample of representative scenarios for a model in which a complete enumeration of all possible states of the model would be prohibitive or impossible.
VV&A really did not become a serious issue until models and simulations were computerized, and computers did not become available until the late 1940s. Starting in the late 1940s, both digital and analog computers became available. However, very few (if any) engineers were trained on how to use this newly-developed computing power. There are various stories of how modeling and simulation became a powerful force in the DOD, but the story I have personal knowledge of is the story of John McLeod – an engineer working at the Naval Air Missile Test Center at Point Mugu on the California coast north of Los Angeles. John was an innovator who, after working on analog computers and simulations in the early 1950s, took delivery of a new analog computer sometime in 1952. John was not the only engineer in the aerospace community in Southern California facing the same problems, and a few of them decided to get together as an informal user group to exchange ideas and experiences. To make a long story short, John helped found what became the Society for Computer Simulation (SCS). This organization, over the years, has had members who were leaders and innovators in the field of modeling, simulation, and VV&A. [Note that I had the privilege to be the President of SCS from 2011–2012, so I am a bit biased.] The SCS has, to this day, the McLeod award to commemorate the advances John McLeod made in the M&S arena. It is only awarded to those who have made significant contributions to the profession.
The SCS published newsletters. M&S conferences were organized. Leaders in the field were able to meet, publish, and share their expertise. All of which helped integrate M&S into more and more domains. As a result of leaders in the field being able to share M&S information, and also as a result of the huge increase in the capabilities and availability of computers to run M&S, the need for VV&A also increased. Over the years, modeling and simulation became more and more important in many domains within the DOD. It helped develop fighters (in fact, aircraft of all types). It helped train our astronauts to land on the moon. It modeled the space shuttle. Complex models and simulations helped us model ballistic missile defense, fight war games with minimal expense (and no lives lost!), and design complex weapon systems. In fact, it’s hard to imagine any technologically sophisticated domain that does not use M&S to save money, save time, and ensure safety. But these increasingly complex models needed verification and validation, and frequently accreditation.
So – the proliferation in the use of M&S led to an increased need for VV&A. M&S became so complex that VV&A could not be accomplished without “domain experts” – usually referred to as “Subject Matter Experts” (SMEs) – to help. Increased complexity of the M&S led to increased complexity of the VV&A. Various elements within the DOD were performing VV&A on their own, with little official coordination. To leverage the experience of various DOD components and multiple domains, the DOD saw the need for a single point of coordination. As a result, in the 1990s, the DOD formed the Defense Modeling and Simulation Office (DMSO). The DMSO served as a single point of coordination for all M&S (and VV&A) efforts within the DOD. One of the best DMSO contributions was the VV&A Recommended Practices Guide (VV&A RPG) – first published in 1996. The guide has been updated several times over the years, reflecting the increased importance of VV&A in the DOD. In 2008, DMSO was renamed the Modeling and Simulation Coordination Office. The MSCO web site (and the latest version of the VV&A Recommended Practices Guide) can be found at msco.mil.
For those of you interested in M&S and VV&A, I cannot recommend the MSCO resource enough. It costs nothing (not even email registration), and contains a vast amount of information about M&S and VV&A. The RPG Key Concepts document alone contains 34 pages of critical “background” information that you should read before going any further in VV&A.
Luke: In Cook (2006) you write that one of the reasons V&V is so difficult comes from “having to ‘backtrack’ and fill in blanks long after development.” What does this mean? Can you give an example?
David: Let’s imagine you are designing a new fighter aircraft. It is still on the drawing board, and only plans exist.
Rather than spend money building an actual prototype first, you develop mathematical models of the jet to help verify the performance characteristics. You might actually build a very small model of the body – maybe 1/10th size for wind tunnel experiments.
You also build computer-based models and execute them to estimate flight characteristics. The wind-tunnel experiments (even though only on a 1/10th size model) will give data that might make you modify or change the computer-based model. This feedback loop consists of “build model – run simulation – examine data – adjust model”, and repeat.
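That feedback loop can be sketched as a tiny calibration routine. Everything here is hypothetical – a single scale parameter stands in for whatever the wind-tunnel data actually tells you to adjust:

```python
def calibrate(param, predict, observe, scenarios, tol=1e-3, max_iter=50):
    """Build model - run simulation - examine data - adjust model, repeated.

    `predict(scenario, param)` plays the model; `observe(scenario)` stands in
    for wind-tunnel (or flight-test) data. Both are placeholders.
    """
    for _ in range(max_iter):
        errors = [observe(s) - predict(s, param) for s in scenarios]
        mean_err = sum(errors) / len(errors)
        if abs(mean_err) < tol:        # model now tracks the data: stop
            break
        param += 0.5 * mean_err        # damped adjustment from the new data
    return param
```

Each pass through the loop is one turn of “examine data – adjust model”; in practice the adjustment is an engineering judgment, not a one-line update.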
Eventually, you build a working prototype of the jet. Almost certainly, the actual flight characteristics will not exactly match the computer-based model. The prototype is “real world” – so you have to readjust the computer-based model. The “real-world” prototype is just a prototype – and probably not used for high-speed fighting and turns – but the basic data gathered from the flying of the prototype leads to changes in the computer-based model, which will now be used to predict more about high-speed maneuvering.
Back when I worked on the Airborne Laser – we had models that predicted the laser performance before the laser was actually built or fired! The models were based on mathematical principles, on data from other lasers, and on simpler, earlier models that were being improved on. Once a working Airborne Laser was built and fired – we had “real world” data. It was no surprise to find out that the actual characteristics of the laser beam were slightly different than those predicted by the models. For one thing, the models were simplistic – it was impossible to take everything into account. The result was that we took the real-world data, and modified the computer models to permit them to better predict future performance.
The bottom line is that the model is never finished. Every time you have additional data from the “real world” that is not an exact match to what the model predicts, the model should be examined, and the model adjusted as necessary.
There are two terms I like to use for models when it comes to VV&A – “anchoring” and “benchmarking”. If I can get another independently-developed model to predict the same events as my model, I have a source of validation. I refer to this as benchmarking. Subject matter experts, other simulations, similar events that lend credence to your model – all improve the validity, and provide benchmarking. Anchoring, on the other hand, is when I tie my model directly to real-world data.
As long as the model is being used to predict behavior, it needs to be continually tied or anchored to real-world performance, if possible. If no real-world data is available, then similar models, expert opinions, etc. can also be used to increase the validity.
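One hypothetical way to make the two terms concrete: benchmarking compares my model against an independently-developed model, while anchoring compares it against measured real-world data. The tolerance values below are arbitrary placeholders:

```python
def benchmark(my_model, independent_model, inputs, rel_tol=0.05):
    """Benchmarking: do two independently-developed models agree on the same inputs?"""
    return all(
        abs(my_model(x) - independent_model(x)) <= rel_tol * abs(independent_model(x))
        for x in inputs
    )

def anchor(my_model, measurements, rel_tol=0.05):
    """Anchoring: does the model track real-world measurements (input, observed)?"""
    return all(abs(my_model(x) - y) <= rel_tol * abs(y) for x, y in measurements)
```

Passing benchmark() lends credence; passing anchor() is the stronger evidence, since the comparison is against reality rather than against another model.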
Just a final note. Models can become so ingrained in thought that they become “real”. For example, I remember when the recent Star Trek movie (the 2009 version) came out. A friend of mine said, after viewing the movie, that he had trouble with the bridge of the USS Enterprise. It did not “look real”. I asked what “real” was – and my friend replied, “You know, like the REAL USS Enterprise, the NCC-1701” (referring to the original series). Think about it – both are notional and imaginary (sorry, fellow Trekkers) – yet he viewed one as “real” and the other as inaccurate. Models – when no real-world artifact exists – have the potential to become “real” in your mind. It’s worth remembering that a model is NOT real, but only an artifact built to resemble or predict what might (or might not) eventually become real one day.
Luke: Do you have a sense of how common formal verification is for software used in DoD applications? Is formal verification of one kind or another required for certain kinds of software projects? (Presumably, DoD also uses much software that is not amenable to formal methods.)
David: I have not worked on any project that uses formal V&V methods.
I used to teach the basics of formal methods (using ‘Z’ – pronounced “Zed”) – but it is very time consuming, and not really fit for a lot of projects.
Formal notation shows the correctness of the algorithm from a mathematical standpoint. For modeling and simulation, however, formal methods do not necessarily help you with accreditation – because they check the correctness of the code, and not necessarily the correlation of the code with real-world data.
I have heard that certain extremely critical applications (such as reactor code, and code for the Martian Lander) use formal methods to make sure that the code is correct. However, formal methods take a lot of training and education to use correctly, and they also consume a lot of time in actual use. Formal methods seldom (never?) speed up the process – they are strictly used to verify the code.
From my experience, I have not worked on any project that made any significant use of formal methods – and in fact, I do not have any colleagues who have used formal methods, either.
Luke: Thanks, David!