Model-Based Definitions

White Paper by Bernhard Rumpe, Judith Michael, Software Engineering, RWTH Aachen, May 2023

It is always relevant to have a precise and commonly shared understanding of the relevant concepts. Here are ours on modeling: compact, with few explanations and no examples. For such additional material, please see our publications.

Core Definition

Definition: Model

A model is characterized through three properties:

  • A model describes some original, real system/entity that exists or is intended to exist in the future.
  • A model is an abstraction of the original.
  • A model has a purpose with respect to the original.

(based on Stachowiak [Sta73])

For a detailed discussion of the possible implications of this definition, please consult our various publications. Highlights are, e.g., the discussion on the semantics of models [HR04], how to construct modeling languages using the MontiCore language workbench [HKR21], and especially the discussion on underspecification as an intrinsic part of models as opposed to programs [RW18].

Classification of Model Kinds

Models can be classified in various ways, as shown below, even though not all classifications are equally relevant.

Which scope a model describes
  • System model describes the overall system.
  • Test model describes one or several tests and is used for testing.
  • Component model describes a system component.
  • Context model describes the context of a system or component under consideration. It is not to be confused with “context condition” introduced below. The context model is dependent on the scope of the development, i.e., which system components belong to the context. When the scope changes, e.g., to the next level sub-system, then system components and their models become context elements.
Which aspect of a system a model describes
  • Behavior model, structure model, architecture model, interaction model, state model, communication model, data model, etc., are models that serve the purpose of describing a respective aspect of a system or component.
In which development activity/phase a model is defined and used
  • Requirements model, functional model, logical model, technical model, and implementation model, but also architecture model, test model, etc. are models that are mainly developed and used in the respective activity of a development project.
  • Note that many models are used for other purposes, e.g., scientific models or data models.
Form and representation of the model
  • Conceptual model is a representation of entities and their relationships. This is a broadly applicable term for many kinds of models. It is prominent in the database and the ontology/logical communities and very much depends on the term “entity”.
  • Physical model is a physical, touchable model (it hurts when it falls on your feet).
  • Thought model (mental model) is the mental construction of a person to gain an understanding of the system of interest. It is argued that any explicit model is translated by the reading person into an internal thought model.
  • Paper model (blackboard model) is an informal drawing denoted on paper, a wall, or a board.
  • Explicit model is a model denoted in an explicit modeling language, such as UML, SysML, or a domain-specific language.
  • Diagram is an explicit model denoted in a diagrammatic modeling language, such as UML’s class diagrams or Statecharts.
  • Text model is an explicit model denoted in a textual modeling language, such as UML’s OCL or MontiCore’s CD4A.
  • Programmed model is an implicit model that is encoded in an executable program. The model is hidden in the code and mainly usable for simulation. The programming language is not to be confused with the (non-existent) modeling language.
How the model came into existence
  • Designed model is a model that has been created through human activity.
  • Learnt model is a model that has been constructed from data, e.g., using one of the various machine learning techniques.
  • Derived model is a model that has been derived by some algorithm, e.g., through abstraction, composition, transformation, refinement, and other algorithmically assistable derivation techniques.
What the model is used for

A similar categorization that however addresses a mixture between usage and origin:

  • Prescriptive model describes the system under development before the system actually is developed. This is typically a designed engineering model or it has been derived from designed models.

  • Descriptive model describes a system that already exists. This is typically a designed scientific model, but also an engineering model, when the system has been realized. Descriptive engineering models make it possible to understand, examine, and optimize the already built system.

  • Predictive model also describes a system that already exists, but it has its origin in observations of the system. It is learnt from these observations and related forms of data and has a rather implicit form, so that it cannot be examined directly; instead, executions or queries produce answers, mainly to predict future behavior. Machine learning is based on statistical correlations and does not necessarily imply causality in the form covered by descriptive models.

What the model is applied on
  • Engineering model (Development model) is a model that is used in a development project and denotes the system under development. This includes software, biological, mechanical, chemical systems as well as social, economic, or producing processes.
  • Scientific model is a model that describes real-world phenomena and is used to understand the existing world and its processes.

“When the model doesn’t fit the modeled system, then the scientist says that the model is incorrect, while the engineer says the system was badly constructed.” (well known, true joke). We, however, observe models may serve both purposes because, e.g., biological science models may become bio-engineering models. In Computer Science, the situation is even more complicated because the existence of generators may change the purpose of a model from (scientific) understanding to constructive use and even to the implementation itself.

  • Metamodel is a model describing some kind of meta-structure. A common use is to define the abstract syntax of a modeling language, which makes each model (or, to be exact, the model’s AST) an instance of the metamodel. The MOF provides a hierarchy of metamodels, where finally, the meta-meta-meta-model is an instance of itself.

In the domain of language engineering, we thus abandon fuzzy forms of use and only use the narrower definition:

Definition: Metamodel (syn.: Meta-Model)

A metamodel is a model describing the abstract syntax of a language.

(definition from [CFJ+16])

Language

Definition: Software Language

A software language is a human-readable and computer-processable language addressing a particular problem.

Programming languages, configuration languages, but also modeling languages are such languages that are understandable for both kinds of stakeholders, namely humans and computers.

Definition: Modeling Language

A modeling language defines a set of models that can be used for modeling purposes. Its definition consists of

  • the syntax, describing how its models appear,
  • the semantics, describing what each of its models means, and
  • its pragmatics, describing how to use its models according to their purpose.

(definition from [CFJ+16])

UML and SysML are prominent modeling languages with relatively general applicability in various domains.

Definition: Domain-Specific Language (DSL)

A domain-specific language (DSL) is a software language specialized for a particular application domain.

E.g., the MontiCore language workbench provides techniques to define domain-specific languages and offers a set of predefined domain-specific languages.

Parts of a language

A language is described by the following concepts:

  • (1) Syntax (“representation” and “structure”)
  • (2) Semantics (“meaning”)
  • (3) Model library (“vocabulary”)
  • (4) Pragmatics (“forms of use”)

The syntax of a language can be divided into the following aspects:

  • (1.1) Concrete syntax
  • (1.2) Abstract syntax (AST)
  • (1.3) Context conditions (CoCos)

Syntax

Definition: Concrete Syntax

The concrete syntax of a modeling language is used to describe the concrete representation of the models and is used by humans to read, understand, and create models. The concrete syntax must be sufficiently formal to be processable by tools.

(definition from [CFJ+16])

Definition: Abstract Syntax

The abstract syntax of a modeling language contains the essential information of a model, disregarding all details of the concrete syntax that do not contribute to the model’s purpose. It is of particular interest for use by software tools.

(definition from [CFJ+16])

The abstract syntax (AST) of a model is then a concrete value of the AST of the language.

The abstract syntax is usually defined using context-free grammars or class diagrams (also called “metamodels”). Both notations are limited in their expressiveness, and thus, it is usually necessary to add additional constraints to restrict the set of models to define the well-formed models:

Definition: Context Conditions

Context conditions are boolean predicates constraining the set of models to the subset of well-formed models.

Context conditions are thus also called well-formedness rules. The term “static semantics” is sometimes used as well, but we regard it as misleading because context conditions do not tell us anything about the meaning (i.e., semantics) but purely constrain the syntax.
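
To illustrate the idea of context conditions as boolean predicates, here is a minimal Python sketch. The tiny class-diagram AST, the two predicates, and all names (`ClassDiagram`, `class_names_unique`, `attribute_types_declared`, `BUILTIN_TYPES`) are assumptions made for this example; they are not taken from MontiCore or any other tool.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    type_name: str


@dataclass
class ClassDecl:
    name: str
    attributes: list[Attribute] = field(default_factory=list)


@dataclass
class ClassDiagram:
    """Hypothetical abstract syntax of a very small class-diagram language."""
    classes: list[ClassDecl] = field(default_factory=list)


# Assumed set of predefined types for this sketch.
BUILTIN_TYPES = {"int", "String", "boolean"}


def class_names_unique(cd: ClassDiagram) -> bool:
    """Context condition: no two classes share a name."""
    names = [c.name for c in cd.classes]
    return len(names) == len(set(names))


def attribute_types_declared(cd: ClassDiagram) -> bool:
    """Context condition: every attribute type is built in or a declared class."""
    known = BUILTIN_TYPES | {c.name for c in cd.classes}
    return all(a.type_name in known for c in cd.classes for a in c.attributes)


CONTEXT_CONDITIONS = [class_names_unique, attribute_types_declared]


def is_well_formed(cd: ClassDiagram) -> bool:
    """A model is well-formed if all context conditions hold."""
    return all(coco(cd) for coco in CONTEXT_CONDITIONS)


good = ClassDiagram([
    ClassDecl("Person", [Attribute("name", "String")]),
    ClassDecl("Car", [Attribute("owner", "Person")]),
])
# Syntactically a valid AST, but "Person" is never declared.
bad = ClassDiagram([ClassDecl("Car", [Attribute("owner", "Person")])])
```

Both `good` and `bad` are values of the abstract syntax; only the predicates separate the well-formed model from the ill-formed one, which is why such rules cannot be expressed by the grammar or metamodel alone.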

Semantics

Definition: Semantics

The semantics of a modeling language captures the essential information of its models in the form of an explicitly defined

  • syntactic domain that describes all well-formed models,
  • a semantic domain that captures all essential information that the models can describe, and
  • a semantic mapping that relates the syntactic constructs of the models to the semantic domain.

(definition from [CFJ+16])

Special forms of semantics are (1) denotational semantics, (2) axiomatic semantics, and (3) operational semantics using different techniques and serving different purposes. We only look further at denotational semantics. It describes what a model means without talking about how this meaning is actually achieved. Denotational semantics is typically defined using mathematical constructs. As a result, it is quite abstract and typically requires some training in mathematics to understand. Its advantage is that because we have the full power of mathematics at hand, we do not have to suffer from computational limitations and can, e.g., adequately address the principle of underspecification.
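
As a rough illustration of a denotational, set-valued semantics that accommodates underspecification, consider the following Python sketch. The two boolean variables, the dictionary-based model notation, and the function name `sem` are hypothetical choices for this example: a model fixes the values of some variables, and its meaning is the set of all total assignments compatible with it.

```python
from itertools import product

# Assumed semantic domain: all total assignments of these variables to booleans.
VARIABLES = ("heating", "window_open")


def sem(model: dict[str, bool]) -> frozenset[tuple[bool, ...]]:
    """Semantic mapping: a model denotes every assignment it does not exclude."""
    return frozenset(
        values
        for values in product((False, True), repeat=len(VARIABLES))
        if all(
            values[VARIABLES.index(var)] == required
            for var, required in model.items()
        )
    )


empty_model = {}                                        # says nothing
partial_model = {"heating": True}                       # underspecified
full_model = {"heating": True, "window_open": False}    # fully specified
```

The less a model states, the larger its semantics: `sem(empty_model)` contains all four assignments, `sem(partial_model)` two, and `sem(full_model)` exactly one. The mapping says what a model means without prescribing how a conforming system is computed.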

Model Library

Definition: Model Library

A model library is a collection of relatively independent models that are

  • defined for use in other models through language-specific forms of import (reference), refinement, or adaptation and
  • that can be composed with other models.

The existence of a library of reusable models is one key success factor for a language to become used in practice. Models only become reusable as independent, modular artifacts that can be somehow “imported”, “woven”, “merged”, or otherwise composed if appropriate operators exist. While modularity and, thus, extensive libraries are a standard in modern programming languages, many modeling languages suffer from inadequate or even missing composition operators and missing reusable libraries. Sometimes this is due to a lack of modularity in the modeling language itself.

Tools

Definition: Tool

A (software) tool is a program that is employed in the development, repair, or enhancement of software programs.

Software tools can assist developers in all activities of all phases of a software life cycle, including management, quality assurance, requirements elicitation, design, implementation, integration, deployment, evolution, optimization, and automation activities. Software tools can also be used during the operation of the developed software program for its monitoring, debugging, repair, configuration, or other forms of analysis.

Definition: Language Workbench

A language workbench is a tool that provides a set of techniques supporting the development and evolution of a language and its associated tooling, including design, implementation, deployment, evolution, reuse, and maintenance.

MontiCore, as described in [HKR21], is such a language workbench.

A language workbench can also be called a “meta-tool” as its focus is on tools as products. A language workbench can also be used to recursively develop parts of itself.

Tools can be categorized according to the following options:

  • An API-based tool is essentially a set of methods that can be called to execute the functionality within an already existing process. The set of methods is specific to the tool’s functionality.
  • A command-line tool is an executable artifact that can be called from a command-line shell or a script and runs in its own virtual process space.
  • A plugin is a special form of API-based tool, where the API is predefined by an integrated development environment (IDE), in which a tool is pluggable when it realizes that API. The tool’s API is IDE-dependent but rather independent of the tool’s functionality.

Accordingly, there are several kinds of sources that tools can consume and results that they can produce:

  • The data may be available as an object structure within the program, which may be manipulated directly (API-based tools only).
  • The data may be available as an artifact (file) in the directory and thus must be loaded or parsed into an internal representation.
  • The data may be stored in a database and needs retrieval.

Although every set of methods might be regarded as an API and, therefore, as an API-based tool, we suggest that certain forms of coherent functionalities with adequate granularity should be called a tool. In practice, many tools come with both API and CLI interfaces because, typically, a main class method translates external invocation to internal API calls.
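
The combination of an API and a CLI interface can be sketched in Python as follows. The tool itself (counting the elements of a textual model) and the names `count_elements` and `main` are assumptions for illustration only; the point is merely that the command-line entry point loads the artifact and delegates to the API function.

```python
import argparse
from typing import Optional


def count_elements(model_text: str) -> int:
    """API entry point: works on data that is already in memory.

    Counts non-empty lines that are not '//' comments (a stand-in for
    real model elements in this sketch).
    """
    return sum(
        1
        for line in model_text.splitlines()
        if line.strip() and not line.strip().startswith("//")
    )


def main(argv: Optional[list[str]] = None) -> int:
    """CLI entry point: loads the artifact (file) and calls the API."""
    parser = argparse.ArgumentParser(
        description="Count the elements of a textual model."
    )
    parser.add_argument("model_file", help="path to the model artifact")
    args = parser.parse_args(argv)
    with open(args.model_file, encoding="utf-8") as handle:
        print(count_elements(handle.read()))
    return 0

# A script would typically end with: raise SystemExit(main())
```

An API-based client calls `count_elements` directly on its object structure or string, while a shell script invokes the same functionality through `main`, which only adds argument parsing and file loading.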

Definition: Tool Workflow

A tool workflow is a combination of tool executions in a predefined order used to produce certain results in a fully automated form.

Development workflows, in general, may also involve developers, testers, or lead users, and there are some development processes where these workflows are made explicit. A tool workflow focuses only on the automation aspect, allowing the workflow to be executed fully automatically. This may happen upon each relevant software change (e.g., version commit), daily overnight, when certain stages, e.g., release, are reached, etc. Essentially, a tool workflow can be defined using a purely sequential script, but various efficiency optimizations are very helpful:

  • (1) the workflow should be executed incrementally, where only the necessary actions are repeated, and earlier intermediate results are reused,
  • (2) the users don’t have to care about the action dependencies, but the workflow manager itself knows how to sequentialize the defined actions and also when actions can be executed in parallel.

Gradle and make are such workflow managers if configured well.
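
The two optimizations can be sketched in a few lines of Python using the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). This is a simplified illustration, not how Gradle or make work internally; the action names and the `DEPENDENCIES` table are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: action -> set of actions it depends on.
DEPENDENCIES = {
    "parse": set(),
    "check_cocos": {"parse"},
    "generate_code": {"check_cocos"},
    "generate_docs": {"check_cocos"},
    "compile": {"generate_code"},
}


def affected_actions(changed: set[str]) -> set[str]:
    """Changed actions plus everything that transitively depends on them."""
    affected = set(changed)
    grew = True
    while grew:
        grew = False
        for action, prerequisites in DEPENDENCIES.items():
            if action not in affected and prerequisites & affected:
                affected.add(action)
                grew = True
    return affected


def plan(changed: set[str]) -> list[list[str]]:
    """Order the affected actions into batches.

    Actions within one batch have no dependencies on each other and
    could be executed in parallel.
    """
    todo = affected_actions(changed)
    # Only affected actions are scheduled; other results are reused.
    graph = {action: DEPENDENCIES[action] & todo for action in todo}
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

With this table, `plan({"parse"})` schedules every action, with `generate_code` and `generate_docs` in one parallel batch, whereas `plan({"generate_code"})` only repeats `generate_code` and `compile` and reuses the earlier results.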

Model Engineering and its Activities

Definition: Model Engineering

Model Engineering comprises all activities around the creation, transformation, analysis, and management of models and their related artifacts with the purpose of assisting software and systems development or real-world understanding.

To better support the use of models during development, it will become necessary to further understand models in terms of how they can be transformed and analyzed, as well as understand the impact of software size and complexity on model creation, management, and evolution. We expect that “Model Engineering” will become a new and highly interesting sub-discipline in the Software Engineering discipline.

As a consequence: Model Engineering might be considered a new subdiscipline of Software Engineering.

The following definitions describe a number of operations on models.

Forms of Model Operations and Relations

Model Transformation

A model transformation applies to one or more source models and transforms them into one or more target models.

(definition from [CFJ+16])
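
As a small illustration, a transformation from one source model to one target model could look like the following Python sketch. The dictionary-based notation for the class model and the table model, the synthetic `id` column, and the function name `classes_to_tables` are assumptions made for this example.

```python
def classes_to_tables(class_model: dict[str, list[str]]) -> dict[str, list[str]]:
    """Transform a class model into a (hypothetical) relational table model.

    Each class becomes a table named after the lower-cased class name,
    with a synthetic 'id' key column followed by one column per attribute.
    """
    return {
        name.lower(): ["id", *attributes]
        for name, attributes in class_model.items()
    }


source_model = {"Person": ["name", "age"], "Car": ["licensePlate"]}
target_model = classes_to_tables(source_model)
```

The source model is left unchanged, and the target model is again a model (here in a different language) that can be analyzed or transformed further.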

Refactoring

Refactoring is a relation between two models. Model B is a refactoring of model A if all information that model A contains about the modeled system can also be derived from model B, but not more.

(definition from [CFJ+16])

Refinement

Refinement is a relation between two models. Model B is a refinement of model A if all information that can be derived from model A can also be derived from model B.

(definition from [CFJ+16])

While refactoring is typically regarded as “horizontal”, i.e., remaining on the same level of abstraction, refinement is regarded as “vertical”, i.e., downstream from abstract requirements to concrete realization. Refactoring enables evolution.

Refinement is only possible if the modeling language itself provides forms of abstractions, which allow the omission of details, but also through explicit underspecification by using special language constructs.
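
One possible way to make the two relations concrete, assuming a set-valued semantics in which each model denotes the set of realizations it allows, is sketched below in Python. Under this assumption, B refines A if B allows no realization that A excludes, and B is a refactoring of A if both allow exactly the same realizations. The function names and the example sets are hypothetical.

```python
def refines(sem_b: frozenset, sem_a: frozenset) -> bool:
    """B refines A: every realization allowed by B is also allowed by A."""
    return sem_b <= sem_a


def is_refactoring(sem_b: frozenset, sem_a: frozenset) -> bool:
    """B is a refactoring of A: both allow exactly the same realizations."""
    return sem_b == sem_a


# Assumed semantics of three models, given directly as sets of realizations.
abstract = frozenset({"r1", "r2", "r3"})    # underspecified model
concrete = frozenset({"r1"})                # details added, choices removed
restructured = frozenset({"r1", "r2", "r3"})  # same information, other form
```

Here `concrete` refines `abstract` because it only removes alternatives, while `restructured` is a refactoring of `abstract`. Every refactoring is also a refinement in this reading, but not vice versa.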

Model Composition

Given two independently defined models, model composition is an operator to combine both models into one, such that their individual semantics is retained and certain details of the models are encapsulated.

(definition from [CFJ+16])

Composition of models enables developers to decompose a problem and solve the resulting smaller problems separately, in parallel, and in a modular form. Moreover, composition of models enables the reuse of predefined solutions, e.g., models from a library.

It is notable that composition may not only be applied to models. Related, but also to some extent orthogonal, are component composition, tool composition, tool workflow composition, and language composition.

Development Activities

Code Generation

Code generation is the process of translating a source model into artifacts of an executable language.
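
A deliberately tiny generator can be sketched in Python as follows. The source model is reduced to a class name plus a mapping from attribute names to type names, and the target is Java source text; both the model notation and the function name `generate_java_class` are assumptions for this example, and real generators are usually template-based and far more elaborate.

```python
def generate_java_class(name: str, attributes: dict[str, str]) -> str:
    """Translate a minimal class model into Java source text."""
    lines = [f"public class {name} {{"]
    for attr_name, attr_type in attributes.items():
        # One private field per attribute of the model.
        lines.append(f"  private {attr_type} {attr_name};")
    lines.append("}")
    return "\n".join(lines)


java_source = generate_java_class("Person", {"name": "String", "age": "int"})
```

The generated text declares a class `Person` with the two private fields `name` and `age`; it is an artifact of an executable language, whereas the dictionary passed in plays the role of the source model.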

Model Analysis

Model analysis comprises all activities

  • to understand or query the properties and consequences defined in a model,
  • to ensure the model quality, and
  • to understand the properties of the modeled software, system, or real world.

Code generation comprises all constructive activities, including the generation of tests, reports, documentation, and configurations, while model analysis comprises all analytic activities, including static analysis with formal methods on certain possible deficiencies, simulation of executable models for analyzing dynamic behavior in a relevant subset of scenarios, or testing. We distinguish high-level, complex analysis, e.g., with metric results, from simple context conditions, which at this point have already been checked.

Model analysis deals with the questions of

  • (1) whether the model itself is of high quality,
  • (2) whether the model fits the modeled software, system, or real world, and
  • (3) what analysis techniques must be provided for developers to enable them to retrieve useful information from the model.

Development Processes

Development activities are organized in development processes, which are classifiable according to the following definitions.

Model-Based Software Engineering (MBSE)

Model-based Software Engineering (MBSE) is the use of explicit models and modeling techniques to support software requirements elicitation, design, analysis, testing, integration, verification, validation, configuration, and deployment activities beginning in the conceptual design phase and continuing throughout the development and later life cycle phases.

(our definition)

Model-Based Systems Engineering (MBSysE)

Model-based Systems Engineering (MBSysE) is the formalized application of modeling to support system requirements, design, analysis, verification, and validation activities beginning in the conceptual design phase and continuing throughout development and later life cycle phases.

(from INCOSE SE Vision 2020)

  • In Model-based development (MBD), models are used in some activities of development, but the activities and their order basically remain the traditional development activities.
  • In Model-driven development (MDD), models also drive and guide the development process, and thus, models are the primary engineering artifacts.

Model-driven development is much more helpful because the use of models must drive the process in such a way that other activities can be omitted or at least considerably reduced, such that the overall process becomes more efficient, more timely, and produces better results.

  1. [HKR21]
    K. Hölldobler, O. Kautz, B. Rumpe:
    Aachener Informatik-Berichte, Software Engineering, Band 48, ISBN 978-3-8440-8010-0, Shaker Verlag, May 2021.
  2. [RW18]
    B. Rumpe, A. Wortmann:
    In: Principles of Modeling: Essays Dedicated to Edward A. Lee on the Occasion of His 60th Birthday, Lohstroh, Marten and Derler, Patricia and Sirjani, Marjan (Eds.), pp. 383-406, LNCS 10760, ISBN 978-3-319-95246-8, Springer, 2018.
  3. [CFJ+16]
    B. Combemale, R. France, J.-M. Jézéquel, B. Rumpe, J. Steel, D. Vojtisek:
    Chapman & Hall/CRC Innovations in Software Engineering and Software Development Series, Nov. 2016.
  4. [HR04]
    D. Harel, B. Rumpe:
    In: IEEE Computer Journal, Volume 37(10), pp. 64-72, Oct. 2004.
  5. [Sta73]
    H. Stachowiak:
    Springer Verlag, Wien, New York, 1973.