Mission operations do not scale well without significant automation. Assume that, in the best case, a single small satellite requires a team of ten operators, not including payload operators, and that the number of operators scales linearly with the number of telemetry points to monitor (a large assumption). A constellation of hundreds of satellites would then require thousands of operators and an inordinate operations budget. Furthermore, such systems are likely to require many more operators, because the workload associated with interactions and dependencies scales non-linearly and can result in complex and even unexpected emergent behavior. We hypothesize that automated methods of focusing attention, abstracting information, detecting events such as faults and opportunities, reacting to detected events, and performing onboard automatic replanning could greatly decrease the workload placed upon human operators, even with large volumes of telemetry at high data rates.
A common objection to automation is that automation itself adds even more complexity to a system. One significant flaw in this objection is that it ignores the complexity and unpredictability that humans add to the behavior of the overall system once human-computer interaction is considered. Functions allocated to human operators may in some cases be performed far better than with automation, but predicting that performance is nearly impossible. Studies show that human operators are a source of faults, and faults due to human operators are often difficult to trace or resolve [10]. While it is true that automation adds complexity to the system, this is a testing and validation problem. Automation of functions has the distinct advantages that it can be validated and that it scales much more reliably than systems which rely upon extensive human-computer interaction at low levels of abstraction.
A fundamental concept of the architecture presented here is that development of automation leading toward greater operational autonomy is stepwise, with a testbed incorporated to continuously validate and revalidate the system as it evolves. Tasks are performed initially by humans, who can bring significant common-sense reasoning and cognitive abilities to bear upon ill-defined problems. These tasks can then be automated as they become better understood and as human cognitive skills have less payoff. As a given mission operations system evolves toward greater overall autonomy with proven automation, the scalability of the overall system is greatly increased, since the requirements placed upon the human operator to maintain the performance of each subsystem are reliably decreased.
Features of the mission operations system architecture proposed and explored in this paper which lead directly toward operator load reduction and system scalability include:
The management by exception of mission operations achievable with this architecture can lead to an order-of-magnitude reduction in the human attention required. For example, in the case of telemetry monitoring and fault detection, if a subsystem which includes tens of sensors can be automated to self-monitor and produce a single health and status word or indicator, then an order-of-magnitude reduction is achieved. When such localized self-monitoring is implemented and validated across multiple subsystems, with regression testing as subsystems are combined to expose faults identifiable only at the systemic level, the monitoring load is reliably reduced by an order of magnitude or more. A common problem with automated detection is the trade-off between the number of false alarms raised and the probability of failing to detect an anomaly. This is a matter of tuning and of the effectiveness of the methods chosen for a specific monitoring task. The ability to use the testbed to support tuning and method selection is therefore required.
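The two ideas in the paragraph above can be illustrated with a minimal sketch: collapsing many sensor readings into one health word, and measuring the false-alarm and missed-detection rates of a candidate threshold on labeled testbed data. This is not the implementation used in this work. The names `Limit`, `health_word`, and `detection_rates`, the sensor names, and all limit values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class Limit:
    """Hypothetical per-sensor limit band (values are illustrative only)."""
    low: float
    high: float


def health_word(readings: Dict[str, float], limits: Dict[str, Limit]) -> int:
    """Collapse many sensor readings into one integer status word.

    Bit i is set when the i-th sensor (in sorted name order) is missing or
    out of limits, so a value of 0 means the subsystem is nominal.
    """
    word = 0
    for bit, name in enumerate(sorted(limits)):
        value = readings.get(name)
        band = limits[name]
        if value is None or not (band.low <= value <= band.high):
            word |= 1 << bit
    return word


def detection_rates(
    samples: Sequence[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (false_alarm_rate, miss_rate) for an upper-limit threshold.

    Each sample is (value, is_truly_anomalous). An alarm is raised when
    value > threshold.
    """
    nominal = [v for v, bad in samples if not bad]
    anomalous = [v for v, bad in samples if bad]
    false_alarms = sum(v > threshold for v in nominal)
    misses = sum(v <= threshold for v in anomalous)
    fa_rate = false_alarms / len(nominal) if nominal else 0.0
    miss_rate = misses / len(anomalous) if anomalous else 0.0
    return fa_rate, miss_rate


# Three sensors reduced to one word; only battery_temp is out of limits.
limits = {
    "battery_temp": Limit(-5.0, 40.0),
    "bus_voltage": Limit(26.0, 32.0),
    "wheel_rpm": Limit(0.0, 6000.0),
}
readings = {"battery_temp": 47.5, "bus_voltage": 28.1, "wheel_rpm": 3100.0}
word = health_word(readings, limits)

# Labeled testbed samples: (measured value, known anomaly flag).
samples = [(1.0, False), (1.2, False), (1.9, False),
           (2.1, True), (1.8, True), (2.6, True)]
for threshold in (1.5, 2.0):
    print(threshold, detection_rates(samples, threshold))
```

Sweeping the threshold over recorded testbed data in this way makes the trade-off explicit: a lower threshold raises more false alarms, while a higher one misses more anomalies.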
Reactions to many detected faults are benign. For example, if the power draw of an instrument is anomalous, and the instrument is in a calibration mode during a phase of the mission that is non-critical with respect to science data acquisition, then safing the system by switching off the instrument is a straightforward flight rule that is simple to automate. The inclusion of a rule-based system allows such flight rules to be easily captured with rules that combine modal and contextual logic with the decision boundary logic of the detection layer. The identification of safing flight rules would therefore be a first level of reaction automation built upon the detection layer. The only real danger of such fail-safe automation is that false alarms which cause the system to be spuriously safed lead to unnecessary opportunity loss and a reduction in overall performance. This is why tuning and validation of the detection layer is so important. More complex reaction automations include resource management, scheduling, and performance maintenance despite faults and system degradation. By automating the simplest, most detectable, and highest-frequency event-reaction pairs, the telemetry monitoring and commanding load associated with mission performance maintenance is reduced.
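The instrument power example above can be written as a rule that combines a detection boundary with modal and contextual conditions. The sketch below is a minimal illustration, not the rule-based system used in this work. The names `Context`, `FlightRule`, and `react`, the 12 W limit, the command strings, and the second (escalation) rule are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Context:
    """Hypothetical snapshot of mode, mission phase, and one telemetry value."""
    instrument_mode: str   # e.g. "calibration" or "science"
    mission_phase: str     # e.g. "non_critical" or "critical"
    power_draw_w: float    # measured instrument power draw in watts


@dataclass(frozen=True)
class FlightRule:
    name: str
    condition: Callable[[Context], bool]
    action: str            # hypothetical command mnemonic


POWER_LIMIT_W = 12.0  # assumed detection-layer decision boundary

RULES = (
    # Safing rule from the text: anomalous power + calibration mode +
    # non-critical phase -> switch the instrument off.
    FlightRule(
        name="safe_instrument_on_power_anomaly",
        condition=lambda c: (
            c.power_draw_w > POWER_LIMIT_W
            and c.instrument_mode == "calibration"
            and c.mission_phase == "non_critical"
        ),
        action="INSTRUMENT_POWER_OFF",
    ),
    # Assumed fallback: in any other context, leave the decision to a human.
    FlightRule(
        name="escalate_power_anomaly",
        condition=lambda c: c.power_draw_w > POWER_LIMIT_W,
        action="NOTIFY_OPERATOR",
    ),
)


def react(ctx: Context, rules: Sequence[FlightRule] = RULES) -> Optional[str]:
    """Return the action of the first rule whose condition holds, else None."""
    for rule in rules:
        if rule.condition(ctx):
            return rule.action
    return None


print(react(Context("calibration", "non_critical", 15.0)))
print(react(Context("science", "critical", 15.0)))
print(react(Context("calibration", "non_critical", 8.0)))
```

Ordering the rules from most specific to most general keeps the automated safing reaction confined to the context in which it is known to be benign, and leaves every other case to the operator.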
While linking detection to reaction can alone reduce human operational load by an order of magnitude, automatic replanning can reduce the load further. In fact, replanning is a very highly automated type of reaction. One of the significant points of leverage of this architecture is that automated planning and replanning can be built upon the automated reaction rules together with constraints, which can serve as the kernel of the automatic planning system. The same constraints may also be applied to human interfaces to control human-computer interaction faults. The rules tied to the detection layer can trigger replanning, which can combine the constraints with well-proven planning and scheduling algorithms to generate command sequences and perhaps even additional rules and constraints. Onboard planning has the greatest automation leverage and the greatest risk, but with the systematic approach to achieving higher levels of automation proposed here and supported by the proposed architecture, the risk is greatly mitigated. With the detection and reaction layers of automation, an order-of-magnitude reduction in workload may be achieved, and with automatic planning based upon these layers, further orders-of-magnitude reductions can be achieved, leading to very high-level management by exception of a mission.
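As a rough illustration of how a detection-layer event might trigger replanning under constraints, the sketch below uses a deliberately simple greedy, priority-ordered scheduler. It stands in for the well-proven planning and scheduling algorithms mentioned above and is not the planner used in this work. The names `Activity`, `plan`, and `on_event`, the event string, the halving of the power limit, and all activity values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Activity:
    """Hypothetical schedulable activity."""
    name: str
    duration_s: int
    power_w: float
    priority: int  # higher priority is scheduled first


def plan(
    activities: Sequence[Activity], window_s: int, power_limit_w: float
) -> List[Tuple[int, str]]:
    """Greedy sequential planner.

    Activities are placed back to back in priority order. An activity is
    skipped if it exceeds the power constraint or would overrun the window.
    Returns a command sequence of (start_time_s, activity_name).
    """
    sequence: List[Tuple[int, str]] = []
    t = 0
    for act in sorted(activities, key=lambda a: -a.priority):
        if act.power_w > power_limit_w:
            continue  # violates power constraint
        if t + act.duration_s > window_s:
            continue  # violates time-window constraint
        sequence.append((t, act.name))
        t += act.duration_s
    return sequence


def on_event(
    event: str,
    activities: Sequence[Activity],
    window_s: int,
    power_limit_w: float,
) -> List[Tuple[int, str]]:
    """Reaction rule that tightens a constraint and triggers replanning."""
    if event == "POWER_DEGRADED":
        power_limit_w *= 0.5  # assumed derating after the detected fault
    return plan(activities, window_s, power_limit_w)


activities = [
    Activity("image", duration_s=600, power_w=40.0, priority=3),
    Activity("downlink", duration_s=300, power_w=25.0, priority=2),
    Activity("calibrate", duration_s=200, power_w=10.0, priority=1),
]

print(on_event("NOMINAL", activities, window_s=1000, power_limit_w=50.0))
print(on_event("POWER_DEGRADED", activities, window_s=1000, power_limit_w=50.0))
```

In the nominal case the plan contains the imaging and downlink activities. After the power fault the imaging activity no longer satisfies the power constraint, so the replanned sequence runs the downlink and calibration activities instead, without operator involvement.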