Basic Principles of Gellish

This page describes a number of basic principles that are applied in Gellish English.
Gellish is not intended to become a natural language interpreter; it is oriented towards databases and data exchange messages between computers. Conventionally, information analysts, data modelers and programmers do not use the term 'language' for their data models, although a conventional data model defines expression structures and terminology, which together in fact define some form of 'dedicated language'. Therefore, the formal languages of the Gellish family do not have the free form that natural languages have; their expression structure is in essence a (database) table. Nevertheless, the core elements of each line in such a table can be read as a (nearly) normal natural language expression.

Gellish can thus be described starting from a conventional data model and generalizing such a model towards the flexibility and capabilities of natural languages. But it can also be described starting from natural languages and adopting simplifications and constraints on the allowed expressions, to a level at which computer software is able to interpret the meaning of the expressions, apply logic and act accordingly.
Starting from a natural language, Gellish adopted the following simplifications:

  • Split expressions into collections of binary relations ('basic semantic units'), because in the target language usage (business and technical data processing) every idea can be expressed as a collection of one or more binary relations between related objects.
  • Each binary relation shall be classified by a kind of relation that is predefined in the formal dictionary. The meaning of a kind of relation is independent of the sequence in which the related objects are arranged in the expression (at the left or at the right hand side of the kind of relation). Because kinds of relations are usually denoted by phrases in natural language, each expression has a reading direction. Because the inverse relation is equivalent, there are always two 'phrase types': a base phrase, and an inverse phrase for cases where the related objects appear in the reverse order. As a consequence, the kind of relation defines the first kind of role and the second kind of role, but allows for base phrases as well as inverse phrases for expressing an idea. (N.B. The phrase type UID represents this order in a language-independent way.)
  • Thus the core of each binary relation consists of two related objects, a kind of relation, a phrase type and optionally the two roles or kinds of roles that are played by the related objects.
  • Separate concepts from terminology. This means that each idea, relation, related object or role and each of their kinds is represented throughout the family of languages by a natural language independent unique identifier (UID), whereas multiple names, aliases and translations may denote those UIDs in various 'language communities'. This enables the unambiguous use of synonyms as well as homonyms.
  • Make intentions of expressions explicit by separating the theme of an expression from the explicit intention with which an idea is expressed. This means that grammatical variations in expressions between questions, statements, confirmations, denials, etc. are largely eliminated, while maintaining the variety of expressions. This supports computer dialogues instead of the common limitation to statements and facts (this follows from the 'Speech act theory' of John Searle).
  • Replace past and future tenses by a uniform expression, accompanied by an explicit expression of time indications where necessary. This largely reduces the grammatical variety of expressions about what has been the case and what may become the case.
  • Use only concepts that are selected from the taxonomic dictionary of the formal language, or add them and/or their aliases according to the rules for extension of the dictionary.
  • Replace plural names by singular terms where possible, in combination with explicit cardinalities and numbers of items in collections. This eliminates nearly all plural terms from the dictionary of the formal language.
  • Add explicit contextual information to each basic semantic unit. This eliminates the problem of context dependence of interpretations.

1. Expression of facts by relations

A core principle of the Gellish language is the observation that any idea can be expressed as a collection of one or more binary relations, being relations between two objects. This is the ORO principle: the Object-Relation_type-Object principle. It is supported by the observation that complex ideas can be expressed as collections of binary relations, while adopting the complex idea as an object in its own right. For example, activities and processes are interactions between multiple objects, whereas they can be modeled by a collection of binary involvement relations, each of which expresses the way in which an object is involved in the activity or process.
Furthermore, it is recognized that in a binary relation each of the related objects plays a particular role of a particular kind. Therefore, an extended expression of a fact in Gellish includes those (kinds of) roles and thus becomes: Object-First_role-Relation_type-Second_role-Object. For example, the fact that ‘the Eiffel tower is located in Paris’ is expressed as a relation between the Eiffel tower and Paris, which is classified by the standard Gellish relation type called <is located in>. Furthermore, the Eiffel tower plays the role of located and Paris plays the role of location in that relation.
There is a large number of such standard kinds of relations defined in the Gellish language. Those kinds of relations form the basic grammar of Gellish, and the fact that they are standardized means that software can build on their meaning, which makes the language computer interpretable. The standard relation types are defined in the Gellish English Dictionary, together with the other dictionary concepts and terms. The number of kinds of relations and other concepts is not limited to the current dictionary, because the Gellish Modeling Methodology includes a specification of how the dictionary can be extended with additional concepts and relation types. For example, if the Eiffel tower or Paris or an appropriate relation type did not yet exist in the Gellish English Dictionary, they could be added in a Domain Dictionary or as proprietary extensions. All Gellish expressions can be stored and exchanged using the standard Gellish Expression format.
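The extended Object-First_role-Relation_type-Second_role-Object structure can be illustrated with a small Python sketch. The class name BinaryRelation and its field names are hypothetical and chosen for this illustration only; they are not part of the Gellish definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BinaryRelation:
    """One basic semantic unit: two objects linked by a kind of relation."""
    left_object: str
    relation_type: str
    right_object: str
    first_role: Optional[str] = None   # role played by the left hand object
    second_role: Optional[str] = None  # role played by the right hand object

    def as_sentence(self) -> str:
        # The core elements read as a (nearly) normal English expression.
        return f"{self.left_object} {self.relation_type} {self.right_object}"


fact = BinaryRelation(
    left_object="the Eiffel tower",
    relation_type="is located in",
    right_object="Paris",
    first_role="located",
    second_role="location",
)
print(fact.as_sentence())
```

The roles are optional fields because, as in most Gellish expressions, they usually remain implicit and follow from the definition of the kind of relation.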

The principle that each idea can be expressed as a collection of binary relations is shared with e.g. the NIAM and the Object Role Modeling (ORM) methods. A difference between NIAM, ORM and Gellish is that Gellish classifies the relation and distinguishes the relation from the two roles of the related objects, whereas ORM only recognizes the two roles and does not recognize the relation as a separate object. A more important difference is that neither NIAM nor ORM defines standard kinds of roles (or kinds of relations).

2. Use of a standard dictionary

Conventional methods for data modeling leave developers free to create their own entity types, attribute types or object types and relation types and to allocate their own names to them. Developers usually put more constraints on the (names of) allowed values that may be entered as 'instances' when the data models are populated by data entry in the systems. However, those allowed values are usually not coordinated across multiple systems, so that the content also does not comply with a standard formalized language.
This implies that those methods in fact encourage everybody to create his or her own language (dictionary as well as grammar). This is the root cause of the difficulties with exchanging data between systems and with integrating data that originate from different databases.
To address this issue, a basic principle of Gellish is that all Gellish users shall in principle select their concepts from the predefined Gellish English Dictionary and also use the predefined names, or specify their aliases explicitly. This common use of the same concept definitions, together with proper management of unique identifiers, enables collections of Gellish expressions from different sources to be integrated without conversion, and enables data that originate from one system to be generated or interpreted by any other system that is Gellish enabled.
Furthermore, in addition to the common use of the same dictionary, every user can create his or her own individual objects (conventionally referred to as 'instances'). Whenever concepts that a user needs are missing in the Gellish English Dictionary, it is recommended to create proprietary subtype kinds of objects or relation types. Users are also recommended to propose definitions of new concepts as extensions of the formal Gellish English Dictionary, so that more concept definitions become shared between Gellish users.

Thus Gellish English is not just a modeling method, like e.g. NIAM, ORM, ER and other methods, but it is a complete formal language, including a normal, although enhanced, electronic English (taxonomic) dictionary.

3. Explicit classification of individual things (instances)

Another core concept of Gellish is that every individual thing (or instance) that is used in Gellish expressions needs to be classified by an explicit classification relation with an existing concept (class) in the Gellish English Dictionary or by a concept that is defined as a proprietary extension of that dictionary. If it is required for such a classification to define a proprietary extension of the Gellish English dictionary, then such a new concept should be added according to the rules for a proper definition, which includes defining the concept as a subtype of an existing supertype concept.

This concept for Gellish Databases is different from conventional databases. In conventional databases every instance is implicitly classified by the definition of the attribute type (the definition of a database table column) of which it is an instance. Thus, the kinds of instances that can be stored in a conventional database are limited by the (fixed) number of entity types and attribute types or object types that are defined in its data model. In Gellish English the explicit classification implies that anything can be stored in a Gellish Database, because every concept in the Gellish English Dictionary can be used to classify an instance and missing concepts can be added without modifying the database structure. This means that Gellish English is equivalent to a data model of unlimited size.
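The difference from a fixed data model can be sketched as follows. This is a minimal illustration, not an actual Gellish Database: the class ExpressionStore, the concept 'physical object', the individual 'bridge B-1' and the use of the phrase 'is a kind of' for the specialization relation are all assumptions made for the example.

```python
class ExpressionStore:
    """Toy store: every row is (left object, kind of relation, right object)."""

    def __init__(self, dictionary_kinds):
        self.kinds = set(dictionary_kinds)  # concepts known to the dictionary
        self.rows = []                      # one generic table, no per-kind schema

    def add_subtype(self, subtype, supertype):
        """Add a proprietary concept as a subtype of an existing concept."""
        if supertype not in self.kinds:
            raise ValueError(f"unknown supertype: {supertype}")
        self.kinds.add(subtype)
        self.rows.append((subtype, "is a kind of", supertype))

    def classify(self, individual, kind):
        """Store an individual thing by classifying it explicitly."""
        if kind not in self.kinds:
            raise ValueError(f"unknown kind: {kind}")
        self.rows.append((individual, "is classified as a", kind))


store = ExpressionStore({"physical object", "tower", "city"})
store.classify("the Eiffel tower", "tower")

try:
    store.classify("bridge B-1", "bridge")      # 'bridge' is not yet defined
except ValueError as error:
    print(error)

store.add_subtype("bridge", "physical object")  # extend the dictionary with a row
store.classify("bridge B-1", "bridge")          # now accepted, same table
print(len(store.rows))
```

Adding the new kind of thing only adds rows to the same table; no column or table definition changes, which is the point made above about missing concepts.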

4. Standard data structure: the Gellish Expression format

The structure of Gellish expressions (its syntax) is intended to enable the expression of ideas of any kind. The development process of various ISO standards has revealed that there appears to be a generic structure that is suitable for expressing any kind of idea. Thus, every expression has basically the same generic structure, although there are a large number of optional elements. Furthermore, it appears to be possible to model that structure in a single table. As a result, the Gellish Expression format consists of only one precisely defined table. The core of the structure is illustrated by the examples in Table 1.

Name of left hand object       | Name of the kind of relation                | Role of the right hand object  | Name of right hand object | Unit of measure
-------------------------------+---------------------------------------------+--------------------------------+---------------------------+----------------
the Eiffel tower               | is classified as a                          |                                | tower                     |
Paris                          | is classified as a                          |                                | city                      |
the Eiffel tower               | is located in                               |                                | Paris                     |
tower                          | can have as aspect a                        |                                | height                    |
Paris                          | is a part of                                |                                | France                    |
the Eiffel tower               | has as aspect a                             | the height of the Eiffel tower | height                    |
the height of the Eiffel tower | has on scale a value approximately equal to |                                | 300                       | m

Table 1, The Gellish Expression format with example expressions of ideas

Each line in Table 1 is the expression of a single idea by a binary relation or a combination of binary relations. By default the expressions are interpreted as statements. Each related object plays a particular role or kind of role in a relation of the specified kind. That role or kind of role is determined by the definition of the kind of relation and may be further defined by the related objects. In most cases the roles and kinds of roles remain implicit. For example, in Table 1, the Eiffel tower and Paris are objects that play different roles in the relations of the various kinds.
In the third relation, for instance, the Eiffel tower and Paris play the (implicit) roles 'located' and 'location' respectively. That relation is classified by a kind that is denoted by the phrase ‘is located in’. The Gellish language contains proper definitions of standard relation types, such as 'is located in'. Those definitions include the definitions of the kinds of roles as well as of the kinds of things that may play roles of such kinds. These definitions enable software, to some extent, to verify the semantic correctness of the expressions.

The lines that classify ‘Eiffel tower’ as a ‘tower’ and ‘Paris’ as a ‘city’, are examples of classification relations, which relate individual things to known concepts (kinds of things) in the Gellish dictionary. Those classification relations add those individual things as new concepts to the dictionary.
Note that tower and city should be standard concepts that exist already in the Gellish English Dictionary.

The line that specifies that the Eiffel tower <has as aspect a> height, called 'the height of the Eiffel tower' shows the use of an additional column in the table to introduce the intrinsic aspect. It should be noted that this statement can be expressed without the need to also have the statement on the fourth line that states that a tower can have a height (meaning that it can have a value for a height in an information model; in reality it has by definition some height). The language definition does not put any constraints on which object can have which kind of aspect.
On the last line the (intrinsic) aspect is quantified on a scale by an approximate value. The relation is classified not only as an approximate equality on a scale, but also by the quantification method, being the meter scale.
Note that the use of separate lines for the quantification of aspects makes it possible to allocate different values for different moments in time, in different units, with different measurement accuracies, or with various kinds of relations, such as equality, greater than, less than, etc.

The Gellish Expression format defines a number of other contextual binary relations that can be represented in additional columns on the same line in the standard table, for example about the author, date of creation, etc. The whole format is defined in the document The Gellish Syntax and Contextual Facts. The definition of the syntax/format allows for the use of user-defined subsets of binary relations and thus of subsets of columns in expression format tables. It also does not prescribe a particular encoding, although Unicode (UTF-8) and CSV or JSON are recommended.
Note that Gellish does not prescribe the use of the Gellish Expression format. The separate binary relations that are combined on one line in such a table can also be expressed in other formats, such as RDF or XML.
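As an illustration of the single-table idea, the following Python sketch writes a few rows of Table 1 to CSV and reads them back. It uses the column headings of Table 1 as field names, which is an assumption made for readability; the official column identifiers are defined in The Gellish Syntax and Contextual Facts and are not reproduced here.

```python
import csv
import io

# Column headings copied from Table 1 (not the official column IDs).
COLUMNS = [
    "Name of left hand object",
    "Name of the kind of relation",
    "Role of the right hand object",
    "Name of right hand object",
    "Unit of measure",
]

ROWS = [
    ["the Eiffel tower", "is classified as a", "", "tower", ""],
    ["the Eiffel tower", "has as aspect a",
     "the height of the Eiffel tower", "height", ""],
    ["the height of the Eiffel tower",
     "has on scale a value approximately equal to", "", "300", "m"],
]


def to_csv(rows):
    """Serialize expression rows, header first, into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into one dict per expression."""
    return list(csv.DictReader(io.StringIO(text)))


records = from_csv(to_csv(ROWS))
last = records[-1]
print(last["Name of right hand object"], last["Unit of measure"])
```

Optional columns that are not used on a line simply stay empty, which mirrors the remark above that user-defined subsets of columns are allowed.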

5. Use of standardized kinds of relations

The various kinds of relations are standardized in the Gellish Dictionary; they are the core elements that determine the expressive power of the formal language. Expressions are only correct Gellish expressions if they use kinds of relations that are selected from the Gellish Taxonomic Dictionary. The kinds of relations form a specialization hierarchy (subtype-supertype hierarchy) to ensure consistency of the language. That hierarchy enables software to search also for expressions that use more specialized subtypes of a kind of relation. For example, software can automatically search for things that are connected to each other by any kind of connection, but it can also search for welded connections only. The kinds of relations are further defined in the dictionary (ontology) by the kinds of roles that they require. Those roles are also explicitly defined and arranged in a specialization hierarchy that is compliant with the hierarchy of the kinds of relations. Finally, it is defined which kinds of objects can play roles of those kinds, and those role players are also defined and arranged in a specialization hierarchy. Together these definitions and hierarchies enable Gellish-powered software to verify the correctness of Gellish expressions and the consistency of the use of the language, and they enable the application of logical reasoning during the search to answer questions and queries.
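The search over the specialization hierarchy can be sketched in Python as follows. The hierarchy is simplified to a single supertype per kind, and the kind 'bolted connection' as well as the individual object names are invented for the illustration; only 'connection' and 'welded connection' come from the example above.

```python
# Subtype -> supertype, simplified to single inheritance for this sketch.
SUPERTYPE_OF = {
    "welded connection": "connection",
    "bolted connection": "connection",   # illustrative subtype
    "connection": "relation",
    "assembly relation": "relation",
}

# (left object, kind of relation, right object), with hypothetical objects.
EXPRESSIONS = [
    ("pipe P-1", "welded connection", "flange F-1"),
    ("flange F-1", "bolted connection", "flange F-2"),
    ("pipe P-1", "assembly relation", "piping system S-1"),
]


def is_kind_of(kind, ancestor):
    """True if kind equals ancestor or is a (transitive) subtype of it."""
    while kind is not None:
        if kind == ancestor:
            return True
        kind = SUPERTYPE_OF.get(kind)
    return False


def find(expressions, kind):
    """Return the expressions whose kind of relation is kind or a subtype."""
    return [e for e in expressions if is_kind_of(e[1], kind)]


print(len(find(EXPRESSIONS, "connection")))         # any kind of connection
print(len(find(EXPRESSIONS, "welded connection")))  # welded connections only
```

A query on the supertype returns both connections, while a query on the subtype returns only the welded one, without the query having to enumerate subtypes itself.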
Some examples of important binary kinds of relations with rather trivial kinds of roles are:

  • A composition relationship, with the roles ‘part’ and ‘whole’.
  • A classification relationship with the roles ‘classifier’ and ‘classified’.
  • A specialization relationship with the roles ‘subtype’ and ‘supertype’.

The kinds of roles determine the kinds of things that are suitable to play those kinds of roles, because only specific kinds of things can play specific kinds of roles.
For example:

  • Each ‘part’ role and each ‘whole’ role in a composition relationship can be played only by an individual thing.
  • Each ‘classified’ role in a classification relationship can be played only by an individual thing and each ‘classifier’ role can only be played by a class (being a kind of thing).
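The two rules above can be turned into a simple check, sketched here in Python. The table ROLE_PLAYERS, the lookup NATURE and the function verify are hypothetical names for this illustration; real Gellish software would derive these constraints from the dictionary definitions rather than from hard-coded tables.

```python
# Required nature of the (left, right) role players per relation phrase.
ROLE_PLAYERS = {
    "is a part of": ("individual", "individual"),   # part, whole
    "is classified as a": ("individual", "kind"),   # classified, classifier
}

# Whether a name denotes an individual thing or a kind of thing (a class).
NATURE = {
    "the Eiffel tower": "individual",
    "Paris": "individual",
    "France": "individual",
    "tower": "kind",
    "city": "kind",
}


def verify(left, phrase, right):
    """Return a list of role-player violations (empty if none are found)."""
    if phrase not in ROLE_PLAYERS:
        return [f"unknown kind of relation: {phrase}"]
    errors = []
    for name, required in zip((left, right), ROLE_PLAYERS[phrase]):
        actual = NATURE.get(name)
        if actual != required:
            errors.append(f"{name!r} must be of nature {required}, found {actual}")
    return errors


print(verify("Paris", "is a part of", "France"))
print(verify("tower", "is classified as a", "the Eiffel tower"))
```

The second call swaps the role players of a classification relation, so both the 'classified' and the 'classifier' constraint are reported as violated.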

It is important to note that the natural language phrase representing the kind of relationship determines which kind of role acts as the first role and which kind of role acts as the second role. For example, assume there is an idea that can be expressed in Gellish by the assembly relation on the first row in Table 2, then that expression is equivalent to the expression of the idea on the second row in Table 2 and actually expresses the same idea, but in an inverse expression. According to the above normal English convention both expressions imply that object A has a role as part and object B has a role as whole in a relation that is classified as an assembly relation which is also called a ‘part-whole’ relation.

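UID of idea | Name of left hand object | Name of kind of relation | Name of right hand object
------------+--------------------------+--------------------------+--------------------------
1           | A                        | is a part of             | B
1           | B                        | has as a part            | A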

Table 2, Roles of objects in relations and in inverse expressions.
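The equivalence of the two rows in Table 2 can be made explicit by normalizing every expression to the order of the base phrase. The Python sketch below assumes a small lookup table PHRASES and a function normalise, both hypothetical, that map each phrase to its kind of relation and phrase type.

```python
# Phrase -> (kind of relation, phrase type)
PHRASES = {
    "is a part of": ("assembly relation", "base"),
    "has as a part": ("assembly relation", "inverse"),
}


def normalise(left, phrase, right):
    """Return (first role player, kind of relation, second role player)."""
    kind, phrase_type = PHRASES[phrase]
    if phrase_type == "inverse":
        # An inverse phrase lists the role players in reverse order.
        left, right = right, left
    return (left, kind, right)


row_1 = normalise("A", "is a part of", "B")
row_2 = normalise("B", "has as a part", "A")
print(row_1 == row_2)
```

After normalization both rows state that A plays the 'part' role and B plays the 'whole' role, which is why they can share one idea UID.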

The most generic kind of relationship in Gellish English is simply called ‘relation’ or ‘might be related to’. That concept forms the top of the subtype-supertype hierarchy (taxonomy) of kinds of relations. The ‘relation’ concept is defined in Gellish by expressions of as many pairs of elementary facts as the order of the relation type. As an example, the first four rows of Table 3 show Gellish expressions that define the concept 'relation' by two pairs of elementary facts. Note, by the way, that the terms relation and relationship are used as synonyms.

UID of idea | Name of left hand object | Name of relation type              | Name of right hand object
------------+--------------------------+------------------------------------+--------------------------
1           | relation                 | has by definition as first role a  | relator
2           | relator                  | can be played by a                 | anything
3           | relation                 | has by definition as second role a | related
4           | related                  | can be played by a                 | anything
5           | is a part of             | is a base phrase for               | assembly relation
6           | has as a part            | is an inverse phrase for           | assembly relation

Table 3, Definition of a kind of relation

The definition of a kind of relation on the first four lines in Table 3 is representative of the definition of any kind of relation, provided that each concept in such a definition is itself also defined by a subtype-supertype relation with its supertype concept(s), in which a textual definition is also provided. Thus, in the above example, the concepts relation, relator and related shall all be defined as being a kind of some supertype kind.

6. Use of synonyms, phrases and inverse phrases

Gellish enables the definition of synonyms, abbreviations, codes, etc. for names of concepts as well as for names of kinds of relationships. Each term and each alias is defined to belong either to the 'international' vocabulary (that of the international community) or to a natural language and, within that language, to a 'language community' such as a discipline, a standard or an organization. Standard synonyms are defined in the Gellish Dictionary, but users can define their own terminology as aliases that are specific to their organization and are its preferred terms. This means that their organization is specified as the language community where the term has its base. The same holds for kinds of relations, which are denoted by natural language ‘phrases’, such as the phrases in the column 'Name of relation type' in Table 3.
The way in which synonym phrases are defined for the relation types in the Gellish English Dictionary is illustrated on row 5 in Table 3. Gellish English also defines inverse phrases, which imply that the left hand and right hand related objects are swapped. Row 6 of Table 3 shows an example.
Translations can be defined in the same way as aliases, but for the definition of bulk translations, the Gellish Expression format allows for additional columns with special IDs that are reserved for terms in a specified language.
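The separation of terms from concepts can be sketched as a lookup from (term, language, language community) to a UID. In the Python sketch below the UIDs, the community names and the 'tank' homonym are invented for illustration and do not come from the Gellish Dictionary; 'toren' is the Dutch word for tower.

```python
from typing import NamedTuple


class Name(NamedTuple):
    term: str
    language: str
    community: str
    uid: int  # hypothetical UID, language independent


NAMES = [
    Name("tower", "English", "general", 1),
    Name("toren", "Dutch", "general", 1),               # translation: same UID
    Name("tank", "English", "process engineering", 2),
    Name("tank", "English", "military", 3),             # homonym: different UID
]


def resolve(term, language, community):
    """Find the UID that a term denotes within one language community."""
    matches = [n.uid for n in NAMES
               if (n.term, n.language, n.community) == (term, language, community)]
    if len(matches) != 1:
        raise LookupError(f"no unique concept for {term!r} in {community!r}")
    return matches[0]


def terms_for(uid):
    """List all names, aliases and translations that denote a UID."""
    return sorted(n.term for n in NAMES if n.uid == uid)


print(resolve("tower", "English", "general") == resolve("toren", "Dutch", "general"))
print(terms_for(1))
```

Because expressions would refer to UIDs rather than to terms, synonyms and translations resolve to the same concept, while homonyms stay distinct through their language community.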

7. No separation between meta model and instances

Conventional data modeling methods separate meta models from instances. In fact those methods use two distinct languages. The first is used for the specification of the meta model (such languages are called data modeling languages): the data model defines concepts such as entity types and attribute types, whose names form the vocabulary of the data model language. The data model of the application domain then acts as a framework for the interpretation of the second language, being the terminology in the language of the users.
In Gellish such a separation does not exist. Gellish can be used for both kinds of modeling activities: data or knowledge modeling as well as creating information models of individual things (instance models). The only difference between those two applications of Gellish is that relations between kinds require other kinds of relations than relations between individual things or relations between individual things and kinds. Tables 1 and 2 above provide examples of relations between individual things, where the classification relations cross the bridge between the world of individual things and the world of the kinds of things (the Dictionary). Tables 3 and 4 illustrate kinds of relations that are used for expressing ideas about kinds of things, for example relations that are used for the description of functions. Table 4 illustrates the use of Gellish phrases for relations that express the idea that an engine has by definition an ability to drive (a pump) and can be a performer of a process of driving a pump, in which process a pump is a subject.

Unique fact ID | Name of left hand object | Name of relation type         | Name of right hand object | Description
---------------+--------------------------+-------------------------------+---------------------------+-----------------------------------
1              | engine                   | has by definition as aspect a | ability to drive          | with roles possessor and possessed
2              | engine                   | can be a performer of a       | driving a pump            | with roles performer and performed
3              | pump                     | can be a subject in a         | driving a pump            | with roles subject and subjecting

Table 4, Relations used to describe functions

Further relation types are defined in the part of the Gellish Taxonomic Dictionary that contains the collection of expressions in the base upper ontology.


basic_principles.txt · Last modified: 2017/11/15 11:15 (external edit)