In the field of automation systems, the trend towards ever-increasing
complexity persists. Being able to handle this complexity opens up many
opportunities. It is characteristic of the affected application areas
that system designers are faced with strong nonlinearities, time
variance and a tight integration into natural and safety-critical
environments. In order to reduce engineering effort and/or to enable a
flexible adaptation to (changing) environments, computational
intelligence and machine learning are applied to provide methods for
controlled, i.e. safe, self-optimization.
For this purpose, we use the Organic Robust Control Architecture (ORCA), which is a specific variant of the O/C (observer-controller) architecture. Within ORCA, we use so-called SILKE templates, which provide a second-level observer-controller structure.
As an example, the video shows successful online learning of balancing an inverted pendulum on a real system. In this case, the system learns to swing up and to balance the pendulum in just a few seconds. Please note that this real pendulum cart exhibits some hard physical effects such as backlash and slippage. Additionally, a heavy cart (2.3 kg) has to be moved to balance a pendulum of only 80 grams.
The focus of the SHARCS research line (Self-optimizing Heuristic And
Reliable Control Systems) is thus to fundamentally study, extend and
demonstrate computational intelligence and online machine learning
methods in appropriate, practically relevant scenarios. The acronym
SHARCS is, of course, meant to reflect the main issues of this research line:
- Self-optimizing control systems: Modern control systems have to deal with scenarios whose key characteristics change in ways that are impossible to predict at design time. Our approach is to design control systems with much higher flexibility than conventional control systems. This flexibility is achieved by the use of online machine learning techniques throughout the control system's whole lifetime, especially during its operational phase.
- Heuristic control systems: In contrast to conventional design principles, our approach is to describe the desired behavior of a control system in a model-free way by heuristic rules. In combination with self-optimization, this makes it possible to reduce the total engineering effort and at the same time to keep the self-optimized behavior interpretable. This offers a bidirectional traceability, which is essential for having clear liabilities in the case of an accident involving self-optimizing systems.
- Reliable control systems: In order to guarantee in advance that a self-optimizing control system will act appropriately in any given situation, additional measures have to be taken. Our approach is to guide the process of self-optimization at runtime, i.e., to exploit and guide the flexibility of online learning in a suitable way. In this way, self-optimizing control systems can achieve a high level of trust in a constructive way.
The following methods are developed within the SHARCS research line:
- Directed Self-Learning (DSL): DSL is our main learning architecture. It uses a Takagi-Sugeno fuzzy system, which contains heuristic rules, to directly control the underlying plant, and an application-specific law of adaptation to generate incremental learning stimuli at runtime depending on the system operation. These learning stimuli are used to self-optimize the rules within the fuzzy system.
- System to Immunize Learning Knowledge-based Elements (SILKE): The SILKE approach is our main method for guiding self-optimizing systems. This is done by continually enforcing appropriate meta-level characteristics of the learned rules by means of so-called SILKE templates. Examples include local smoothness or local gradients of the control behavior.
- Online Diagnosis for Incremental Learning (ODIL): The ODIL approach is a variant of the SILKE approach, as it uses similar local templates. The idea is to detect local anomalies of the self-optimization process itself, e.g. learned behavior that is not compliant with the desired meta-level characteristics. These anomalies indicate that something is not working as intended and thus form a suitable interface for countermeasures determined within the context of our TIGERS research line.
- Flexible Rate Adaptation for (Neuro-)fuzzy Control Applications (FRANCA): Every machine learning method requires a trade-off between stability and plasticity/flexibility. For control systems, however, this trade-off depends on the situation, i.e., it is time-variant. To address this issue, we are developing the FRANCA approach, which extends the DSL learning architecture (see above) in order to dynamically tune the stability-plasticity trade-off to the given situation.
- Exploiting Learning stimuli in Interrupted SElf-optimization (ELISE): For some application scenarios, e.g. hybrid systems, it is advantageous to switch between different controllers depending on the current situation. If some of these controllers are self-optimizing, their self-optimization processes are disrupted and can be severely disturbed by this switching. The ELISE approach aims at providing guidance of the learning process in such scenarios. The key idea is again to detect anomalies and to generate additional learning stimuli as a countermeasure.
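To make the interplay of DSL and SILKE more concrete, the following is a minimal sketch and not the actual SHARCS implementation. It assumes a zero-order Takagi-Sugeno system on a single input with triangular membership functions, a learning stimulus that is simply distributed over the active rules in proportion to their activation, and a neighbor-averaging template as one possible way to enforce local smoothness. The names ZeroOrderTakagiSugeno, learn, smoothness_template, rate and strength are hypothetical and chosen for illustration only; the toy stimulus in the demo stands in for an application-specific law of adaptation.

```python
import numpy as np


class ZeroOrderTakagiSugeno:
    """Toy zero-order Takagi-Sugeno system on a 1-D input (illustrative only)."""

    def __init__(self, centers, conclusions=None):
        # One rule per center; each rule has a constant conclusion.
        self.centers = np.asarray(centers, dtype=float)
        if conclusions is None:
            conclusions = np.zeros_like(self.centers)
        self.conclusions = np.array(conclusions, dtype=float)

    def memberships(self, x):
        # Triangular memberships: at most two neighboring rules are active
        # and their activations sum to one. Inputs are clamped to the range
        # covered by the rule centers.
        x = float(np.clip(x, self.centers[0], self.centers[-1]))
        i = int(np.searchsorted(self.centers, x, side="right")) - 1
        i = min(i, len(self.centers) - 2)
        w = (x - self.centers[i]) / (self.centers[i + 1] - self.centers[i])
        mu = np.zeros_like(self.centers)
        mu[i], mu[i + 1] = 1.0 - w, w
        return mu

    def output(self, x):
        # Activation-weighted sum of the rule conclusions.
        return float(self.memberships(x) @ self.conclusions)

    def learn(self, x, stimulus, rate=0.5):
        # Assumed update rule: spread the incremental learning stimulus
        # over the active rules in proportion to their activation.
        self.conclusions += rate * stimulus * self.memberships(x)


def smoothness_template(conclusions, strength=0.5):
    """Hypothetical local-smoothness template.

    Pulls every interior conclusion toward the mean of its two neighbors;
    the boundary conclusions are left unchanged.
    """
    c = np.asarray(conclusions, dtype=float)
    target = c.copy()
    target[1:-1] = 0.5 * (c[:-2] + c[2:])
    return (1.0 - strength) * c + strength * target


if __name__ == "__main__":
    ts = ZeroOrderTakagiSugeno(np.linspace(-1.0, 1.0, 5))
    rng = np.random.default_rng(0)
    for _ in range(2000):
        x = rng.uniform(-1.0, 1.0)
        # Toy law of adaptation (assumption): the stimulus is the deviation
        # from a desired linear behavior u = 2 * x.
        ts.learn(x, 2.0 * x - ts.output(x))
    print("learned conclusions:", ts.conclusions)
    print("spike after template:", smoothness_template([0.0, 0.0, 1.0, 0.0, 0.0]))
```

In this sketch the template is applied to the rule conclusions as a separate step, which mirrors the idea of a second-level structure that supervises the learned rules rather than the plant. How strongly and how often such a correction should act is a design choice that the sketch leaves open.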