Gellish Wiki

1. Gellish Database Design

Each Gellish Database consists of one or more Gellish Database tables. All of those tables have basically the same standardised structure, which is independent of any application system. This differs from conventional databases, which usually have proprietary data structures and tables that are all different. Each Gellish Database table shall contain at least the obligatory columns of one of the subsets of columns that are defined in the Gellish Database Definition document, which is summarised below.
The content of Gellish Database tables shall comply with the grammar and the dictionary of the formal Gellish English language (or a Gellish variant in any other natural language). The standardised tables, combined with the formal Gellish language, make it possible to combine an arbitrary number of Gellish Database tables into one database. Such a database may be centralised, but it can also be distributed. It is also possible to combine the results of a Gellish query sent to various independent data stores, which then act as a distributed database.
The various Gellish Database tables all have the same core of column definitions. Apart from that core, the tables may also have one or more of the optional columns. Preferred collections of columns are defined in standard Gellish database table subsets.

A Gellish Database may be implemented in various formats. It can be in the form of an SQL database, or in XML, or even in XLS (the form of Excel spreadsheet tables).
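As a rough illustration of the SQL option, the sketch below creates a single table in an in-memory SQLite database and stores two of the Eiffel tower facts from section 3.3. The table name `gellish_facts` and the choice of only seven columns are assumptions made for brevity; the column names follow the "name in database" labels listed in section 4.2, and a real table would need the obligatory columns of one of the defined subsets.

```python
import sqlite3

# Hypothetical minimal schema: main fact columns plus names only.
# The table name and the reduced column set are assumptions for illustration.
SCHEMA = """
CREATE TABLE gellish_facts (
    LHObjectUID   INTEGER NOT NULL,   -- column 2
    LHObjectName  TEXT    NOT NULL,   -- column 101
    FactUID       INTEGER NOT NULL,   -- column 1
    RelTypeUID    INTEGER NOT NULL,   -- column 60
    RelTypeName   TEXT    NOT NULL,   -- column 3
    RHObjectUID   INTEGER NOT NULL,   -- column 15
    RHObjectName  TEXT    NOT NULL    -- column 201
)
"""


def create_database(rows):
    """Create an in-memory database and load the given fact rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO gellish_facts VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    return conn


rows = [
    (1, "The Eiffel tower", 101, 5138, "is located in", 2700887, "Paris"),
    (1, "The Eiffel tower", 102, 1225, "is classified as a", 40903, "tower"),
]
conn = create_database(rows)

# Query by UIDs rather than by names, so the query is language independent.
located = conn.execute(
    "SELECT RHObjectName FROM gellish_facts "
    "WHERE LHObjectUID = ? AND RelTypeUID = ?",
    (1, 5138),
).fetchall()
print(located)
```

Because every fact lives in the same table structure, adding a new kind of fact means inserting a row rather than altering the schema.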

2. Limitations of conventional databases

Conventional databases typically consist of many tables, each of which is composed of a number of columns. The definition of those tables and columns determines the storage capabilities of the database, whereas the relations between the columns define the kinds of facts that can be stored in such a database. Together, those columns and relations form the database structure that defines the expression capabilities of the database. Similar rules apply to the structure of data exchange files and thus to the information that is exchanged in electronic data files.
This conventional database technology has some major constraints:

When data was not covered during the database design and thus is not included in the data model, then such data cannot be stored in the database nor exchanged via such a data file structure.
Different databases have different data structures, so data in one database cannot be integrated with data from other databases, nor exchanged between databases, without dedicated data conversion.
A database modification or extension requires redesign of the database structure, modification of software and data conversion, which makes it a relatively complicated and costly exercise.

Another characteristic of conventional databases is that hardly any international standards are available or used for the content of the databases, that is, the data entered by their users. This typically means that local conventions are applied to limit the diversity of data that may be entered in those databases. Because local conventions usually differ from one another, data entered in one database cannot be compared or integrated with data in other databases, even if the database structures are the same and even if the application domain of the databases is the same. For example, a company may have implemented the same system at various sites for the storage of data about equipment, and yet the performance data about a type of equipment at one site cannot be compared with the performance data at another site, because the equipment types have different names and the properties are also different.

3. Characteristics of a Gellish Database

A Gellish database does not have the semantic limitations that conventional databases have, because of the flexible and open Gellish language and because of its standard universal data structure (grammar), which is simple and both computer and human interpretable. A Gellish database consists of one or more database tables, each of which has the same table structure (column definitions). The fact that those Gellish Database tables are standardised and universally applicable makes a Gellish database application independent. A standardised Gellish database table is universally applicable because it enables the application of the following two fundamental principles:

Explicit classification of individual things or explicit specialisation of classes, with an unlimited number of classes in a dictionary.
The Gellish database table can store any kind of object, because any individual object can be introduced by specifying an explicit classification relation between the object and a class. Classes (kinds of objects or concepts) can be selected from the very large number of classes that are already defined in the Gellish English Dictionary. If the proper class is not available, it can be added by specifying a subtype-supertype relation with a direct supertype of the new class. This is fundamentally different from conventional databases, which predefine the object types (classes) about which information can be stored by defining a limited number of entity types and attribute types in a fixed data model.
Explicit classification of relations (facts), by an extensible, unlimited number of standardised relation types.
The Gellish database table can store any kind of fact about any kind of object, because any fact is expressed by a relation, and those relations are explicitly classified by relation types that can be selected from the standardised relation types defined in the Gellish Dictionary or from relation types that are added to the dictionary as proprietary extensions. This is fundamentally different from conventional databases, which predefine a fixed and limited number of relation types between the columns in the database tables (and unfortunately those relation types are usually defined only implicitly).

As a consequence, a Gellish database does not need to be modified or extended when the scope of an application changes, and facts from different Gellish databases can be merged and integrated whenever required without a conversion exercise.
Furthermore, the content of a Gellish Database uses a common Gellish Dictionary for all its data, including, for example, equipment types, property types, document types and activity types.
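The merging of facts from different Gellish databases can be pictured with a small sketch. It assumes that each row is held as a dictionary keyed by database column name and that two rows with the same fact UID express the same fact; the function name `merge_fact_tables`, the sample data and that de-duplication rule are illustrative assumptions, not part of the Gellish definition.

```python
def merge_fact_tables(*tables):
    """Combine rows from several tables, keeping one row per fact UID.

    Assumption: rows that share a FactUID express the same fact, so the
    first occurrence is kept and later duplicates are ignored.
    """
    merged = {}
    for table in tables:
        for row in table:
            merged.setdefault(row["FactUID"], row)
    return list(merged.values())


# Two hypothetical sources that share the same table structure.
site_a = [
    {"LHObjectUID": 1, "FactUID": 101, "RelTypeUID": 5138, "RHObjectUID": 2700887},
    {"LHObjectUID": 1, "FactUID": 102, "RelTypeUID": 1225, "RHObjectUID": 40903},
]
site_b = [
    {"LHObjectUID": 1, "FactUID": 102, "RelTypeUID": 1225, "RHObjectUID": 40903},
    {"LHObjectUID": 1, "FactUID": 103, "RelTypeUID": 4691, "RHObjectUID": 1},
]

combined = merge_fact_tables(site_a, site_b)
print(len(combined))
```

No mapping step between the two sources is needed, because both use the same columns and the same UIDs.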

3.1 Gellish Expressions in a Gellish Database

A Gellish Database is a database that contains one or more standardised Gellish Database tables. Each such table contains the same predefined columns and is suitable for the expression of virtually any kind of fact, such that it is computer interpretable and system independent. The table can be implemented as an MS Access database table, an SQL database table or simply as a standard table in a spreadsheet. The core of a Gellish Database table consists of three columns, just as is the case in RDF/Notation 3. Each row with those three columns in such a table expresses a main (binary) fact. For example, the fact that the Eiffel tower is located in Paris can be expressed as follows:

Left hand object	Relation type	Right hand object
The Eiffel tower	is located in	Paris
The Eiffel tower	is classified as a	tower
Paris	is classified as a	city

The left hand objects and the right hand objects may either be selected from the Gellish English dictionary or may be new proprietary objects that are introduced by defining them on separate lines. If such a new object is an individual thing, then it shall be defined by a classification relation with a class, as is done in the above table. If the new object is a class, then it shall be defined on a separate line by a specialisation relation with its direct supertype. The relation types (such as 'is located in' and 'is classified as a') shall be selected from the Gellish English dictionary; otherwise the expression cannot be called standard Gellish, but becomes a proprietary extension of Gellish English.
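The rule that every new object must be defined on its own line can be checked mechanically. The sketch below holds the three-column table as tuples and reports names that are neither in the dictionary nor introduced by a defining relation. The tiny `dictionary_names` set and the phrase 'is a specialization of' are assumptions for illustration; only 'is classified as a' is taken from the example above.

```python
CLASSIFICATION = "is classified as a"
SPECIALISATION = "is a specialization of"  # assumed phrase, not from this page
DEFINING_RELATIONS = {CLASSIFICATION, SPECIALISATION}


def undefined_objects(facts, dictionary_names):
    """Return names used in facts that are not defined anywhere.

    A name counts as defined if it is in the dictionary or if it is the
    left hand object of a classification or specialisation line.
    """
    defined = set(dictionary_names)
    for left, relation, _right in facts:
        if relation in DEFINING_RELATIONS:
            defined.add(left)
    mentioned = {name for left, _rel, right in facts for name in (left, right)}
    return sorted(mentioned - defined)


facts = [
    ("The Eiffel tower", "is located in", "Paris"),
    ("The Eiffel tower", "is classified as a", "tower"),
    ("Paris", "is classified as a", "city"),
]
dictionary_names = {"tower", "city"}  # stand-in for the Gellish dictionary

print(undefined_objects(facts, dictionary_names))      # all objects defined
print(undefined_objects(facts[:2], dictionary_names))  # 'Paris' lacks a definition
```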

3.2 Multi-language support

Furthermore, a Gellish database structure supports the simultaneous use of multiple languages. This is enabled because a Gellish database table contains a separate column for the language in which a fact is expressed (see the example table below). Thus a Gellish database supports the use of various natural language specific versions of Gellish. In principle, there is a Gellish variant language for each natural language, depending on the availability of a translation of the Gellish concepts. For example, the Gellish English Dictionary defines Gellish English, and contains partial translations to Gellish Deutsch (German) and Gellish Nederlands (Dutch). International terminology (such as most units of measure and mathematical concepts) is included as International Gellish.

3.3 Unique identifiers, homonyms, synonyms and automatic translation

A Gellish database uses a unique identifier for each thing, irrespective of whether it is a user object, a concept from the Gellish dictionary, a fact or a relation type. The following Gellish database table is an extended version of the above example and includes the language in which the fact is expressed as well as the identifiers of the objects.

Language	UID of left hand object	Name of left hand object	UID of fact	UID of relation type	Name of relation type	UID of right hand object	Name of right hand object
English	1	The Eiffel tower	101	5138	is located in	2700887	Paris
English	1	The Eiffel tower	102	1225	is classified as a	40903	tower
Dutch	1	De Eiffel toren	103	4691	is a translation of	1	The Eiffel tower

The unique identifiers enable the use of synonyms and homonyms and allow a computer to automatically translate a Gellish expression in one language into a Gellish expression in another language. This is possible because the meaning of a Gellish expression is captured as a relation between the unique identifiers, so that the meaning is language independent.
A Gellish message can thus be created in, for example, Gellish English, whereas computer software can present it in another Gellish variant, such as Gellish Dutch, if a dictionary or a translation is available, such as on the third line in the above table.
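A minimal sketch of this idea: the fact is stored as three UIDs and a separate lookup gives a name per UID and language. The UIDs come from the table above; the lookup structure `names`, the function `render` and the Dutch name 'Parijs' are illustrative assumptions. When a name is missing in the requested language the sketch falls back to English, in line with the remark in section 4.2 that a relation type name may be given in English.

```python
# Names per (UID, language). The Dutch entry for Paris is an illustrative
# assumption; the other entries come from the example table.
names = {
    (1, "English"): "The Eiffel tower",
    (1, "Dutch"): "De Eiffel toren",
    (5138, "English"): "is located in",
    (2700887, "English"): "Paris",
    (2700887, "Dutch"): "Parijs",
}


def render(fact, language, names, fallback="English"):
    """Present a (left UID, relation type UID, right UID) fact in a language."""

    def name_of(uid):
        # Use the requested language when available, otherwise the fallback.
        return names.get((uid, language)) or names[(uid, fallback)]

    left_uid, relation_uid, right_uid = fact
    return f"{name_of(left_uid)} {name_of(relation_uid)} {name_of(right_uid)}"


fact = (1, 5138, 2700887)
print(render(fact, "English", names))
print(render(fact, "Dutch", names))
```

The stored fact never changes; only the names chosen for presentation do.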

3.4 Auxiliary facts

A full Gellish database table has a number of additional columns that enable the expression of auxiliary facts or data about the main facts. For example, columns for:

a textual definition of the left hand object
the context in which a fact is valid
a unit of measure with its UID
the status of the fact (accepted, proposed, deleted, replaced, etc.)
the originator of the fact
the date of creation of the fact
etc.

4. Gellish Database Table Definition

The document 'The Gellish Database Definition' defines the full set of columns in each table of a Gellish Database. It also defines a number of standardised subsets for usage in applications that do not require the full number of columns.
One of those subsets, the Business Model subset, is suitable for nearly all use cases for database content and data exchange that describe knowledge and propositions. Its application range includes business communication about designs (imaginary objects) as well as real world objects (observed individual objects) during their lifecycle, and about enquiries, answers, orders, confirmations, etc. This table is a superset of the product model table (the additional columns are marked with quotes in the list below), so it can also be used for knowledge about classes of objects.
This subset consists of the following columns in the indicated sequence:
0, 54, 71, 16, '39', 2, 44, 101, '43', '19', '18', 1, 60, 3, '42', 15, 45, 201, 65, 4, 66, 7, 14, 8, 67, 9, 10, 12, 13, 50, 68.
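To make the role of the column sequence concrete, the sketch below projects a row, held as a dictionary keyed by column ID, onto the Business Model subset in the order listed above. The function name `project_to_subset` and the use of `None` for absent fields are assumptions for illustration.

```python
# Column IDs of the Business Model subset, in the sequence given above.
BUSINESS_MODEL_COLUMNS = [
    0, 54, 71, 16, 39, 2, 44, 101, 43, 19, 18, 1, 60, 3, 42, 15,
    45, 201, 65, 4, 66, 7, 14, 8, 67, 9, 10, 12, 13, 50, 68,
]


def project_to_subset(row, column_ids=BUSINESS_MODEL_COLUMNS):
    """Return the row's values in subset order; missing fields become None."""
    return [row.get(column_id) for column_id in column_ids]


# A row keyed by column ID. Column 53 (line identifier) is not part of the
# subset, so it is dropped by the projection.
row = {
    54: "English",
    2: 1,
    101: "The Eiffel tower",
    1: 101,
    60: 5138,
    3: "is located in",
    15: 2700887,
    201: "Paris",
    53: 999,
}

projected = project_to_subset(row)
print(projected)
```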

4.1 The Gellish database table header definition

Each Gellish database table file has in principle a table header as illustrated in Figure 3, extended with additional columns as described in this chapter.
A Gellish database table can consist either of a complete set of columns or of one of the pre-defined subsets of columns as described above.
Each column has a column ID and a column name and has a meaning as defined below.
Note that the presence of a value in a column field implies one or more relations with values in other columns as described below. Those relations define the facts about the objects!

If the table is implemented in a spreadsheet or ASCII or Unicode file, then the table starts with a header of three lines, as follows:

The first line contains a sequence of the following four fields A1, A2, A3 and A4, which shall contain the following text:

A1 = 'Gellish'
A2 = 'Version:'
A3 = version number of the applicable Gellish dictionary
A4 = date of the release of the facts in this table (optional).
followed by free text fields.

The second line contains the column IDs, which consist of standard numbers, although arbitrarily chosen. They allow the columns to be presented in a different sequence without loss of meaning (the numbers below correspond to those column IDs).
The third line contains human readable text in every column field providing a short name of the column. This name is free text.
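The three-line header can be sketched for a tab-separated text file as follows. The functions `write_table` and `read_table`, the tab delimiter and the version string 'X.Y' are assumptions for illustration; only the content of fields A1 to A4 and the meaning of the second and third lines are taken from the description above.

```python
import csv
import io


def write_table(column_ids, column_names, rows, dictionary_version, release_date=""):
    """Serialise a table with the three header lines followed by the body."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Gellish", "Version:", dictionary_version, release_date])
    writer.writerow(column_ids)    # second line: column IDs
    writer.writerow(column_names)  # third line: free-text column names
    writer.writerows(rows)         # body starts on the fourth line
    return buffer.getvalue()


def read_table(text):
    """Parse the header and return (dictionary version, rows keyed by column ID)."""
    lines = list(csv.reader(io.StringIO(text), delimiter="\t"))
    first, id_line = lines[0], lines[1]
    if first[0] != "Gellish" or first[1] != "Version:":
        raise ValueError("not a Gellish table header")
    column_ids = [int(cell) for cell in id_line]
    body = [dict(zip(column_ids, line)) for line in lines[3:]]
    return first[2], body


text = write_table(
    column_ids=[2, 101, 1, 60, 3, 15, 201],
    column_names=["LH UID", "LH name", "Fact UID", "Rel UID", "Rel name", "RH UID", "RH name"],
    rows=[[1, "The Eiffel tower", 101, 5138, "is located in", 2700887, "Paris"]],
    dictionary_version="X.Y",  # hypothetical version number
)
version, body = read_table(text)
print(version, body[0][101])
```

Keying the rows by column ID rather than by position is what lets the columns appear in any sequence without loss of meaning. Note that values read back from a text file are strings and would still need conversion to integers for UID columns.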

4.2 The Gellish database table body column definitions

The lines in a Gellish database table are independent of each other and thus the lines may be sorted in any sequence, without loss of semantics (meaning).

Each line (row) in the body of a Gellish database table (which in a spreadsheet starts on the fourth line) expresses a group of facts, which consists of a main fact and a number of auxiliary facts.

Main fact.
A main fact is expressed by a combination of the following three objects in the columns:

A left hand object id (2), a fact id (1) and a right hand object id (15).

Prime auxiliary facts.

The prime auxiliary facts are expressed by the following pairs of objects (the third object that identifies the fact is left implicit, but should be made explicit in a database):

The relation between the left hand object id (2) and the left hand object name (101).
The relation between the right hand object id (15) and the right hand object name (201).
The relation between the fact id (1) and the relation type id (60).
The relation between the relation type id (60) and its name (3).
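As a sketch of how one row expands into several facts, the functions below take a row keyed by column ID and return the main fact and the four prime auxiliary facts listed above. The function names and the short labels attached to each pair are assumptions for illustration.

```python
def main_fact(row):
    """Main fact: left hand object id (2), fact id (1), right hand object id (15)."""
    return (row[2], row[1], row[15])


def prime_auxiliary_facts(row):
    """Pairs of column values that express the prime auxiliary facts."""
    return [
        ("left hand object id has name", row[2], row[101]),
        ("right hand object id has name", row[15], row[201]),
        ("fact id has relation type id", row[1], row[60]),
        ("relation type id has name", row[60], row[3]),
    ]


row = {
    2: 1,
    101: "The Eiffel tower",
    1: 101,
    60: 5138,
    3: "is located in",
    15: 2700887,
    201: "Paris",
}

print(main_fact(row))
for label, first, second in prime_auxiliary_facts(row):
    print(label, first, second)
```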

Secondary auxiliary facts.

The secondary auxiliary facts are expressed by the pairs of objects that form the context for the validity of the id’s and names for objects identified by their id’s:

The relation between the main fact (1) and its validity context (18).
The relation between the left hand object id (2) and its uniqueness context (17).
The relation between the right hand object id (15) and its uniqueness context (52).
The relation between the uniqueness context for the left hand name (16) and the relation between left hand object id and left hand object name (2, 101).
The relation between the uniqueness context for the right hand name (55) and the relation between right hand object id and right hand object name (15, 201).

Ternary auxiliary facts.

Some ternary auxiliary facts are described in the table below.

Dependent on the type of main fact (the main relation and its relation type) slightly different auxiliary facts can be distinguished and thus slightly different conventions are used to fill in the fields on the line as indicated in the table below.

Several columns contain unique identifiers (UIDs). Each UID should preferably be represented by a 64-bit integer (8-byte, Int64 or bigint), whereas only positive values shall be used. It is not recommended to use an unsigned integer (which only allows positive values), because SQL only provides the bigint datatype, which is signed.
Most other columns contain character string values. For database implementations it is indicated whether they have a fixed or variable length (nvarchar or varchar) or whether the string is externally stored (data types ntext and text). In addition, it is indicated whether the cells may contain Unicode.
Fields in columns that are indicated as optional may be left empty, in which case the indicated default value is applicable. Otherwise a field value is obligatory.
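The UID constraint (positive, and within the range of a signed 64-bit integer) can be expressed as a small check. The function name `is_valid_uid` is an assumption, and the sketch reads "positive" as strictly greater than zero.

```python
INT64_MAX = 2**63 - 1  # largest value of a signed 64-bit integer (bigint)


def is_valid_uid(value):
    """True if value is a positive integer that fits in a signed 64-bit field."""
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 < value <= INT64_MAX


print(is_valid_uid(2700887))
print(is_valid_uid(-1))
print(is_valid_uid(2**63))
```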

The table columns in a Gellish database table are defined as follows (the Col id numbers correspond with the column ID’s in a table):

Col id	Column name (name in database)	Data type, Optionality, Default, Description
0	Presentation key (Sequence)	string (optional), default null. A presentation key indicates a position or field in a presentation structure, such as a spreadsheet or a list of lines. It can support sorting the content of a Gellish database table. It does not contribute to the meaning of the facts represented on the line. This column can be filled in arbitrarily for use in a specific context.
69	Unique language identifier (LanguageUID)	integer (optional), 64 bit, default null. The unique identifier of the language in which the name of the left hand object (see column 101) and the name of the relation type (see column 3) is spelled and, if present, in which the definition (see column 63 and 4) is spelled. The language is a context for the origin of the referencing relation between the UID and the string that is the name.
54	Name of language of left hand object name (Language)	string (optional), Unicode, nvarchar(255), default null. The name of the language of the left hand object name indicates the name of the language for which a UID is given in column 69 and that is a context for the name of the left hand object (see column 101) and the name of the relation type (see column 3). If the relation type name is not available in that language, it may be given in English. The allowed values for ‘language name’ are the names defined in the Gellish Dictionary (or your private extension). Currently the dictionary contains names of natural languages and of (artificial) programming languages. For example: - natural language is a conceptualization of English, French (francais), German (Deutsch), etc. The language ‘International’ shall be used to indicate strings that are language independent.
17	Uniqueness context for left hand object id. (ContextLHUID)	string (optional), Unicode, nvarchar(255), default ‘Gellish’. The uniqueness context for left hand object id provides the context within which the left hand object id, given in column 2, is a unique reference to something. The default context is 'Gellish'. This column is superfluous in normal Gellish, because each object has a Gellish UID and an id in another context can be specified as a name (or synonym) of the object with the Gellish UID. The column is intended for research in the field of multi-language, multi-context integration.
2	Unique left hand object identifier (UID-2) (LHObjectUID)	integer, 64 bit. A unique left hand object identifier is the identifier of the main object about which the line defines a fact. That main fact is an association between the two objects mentioned in columns 2 and 15. The external identifier (name) of the object in column 2 can be given in column 56 with its text attribute in column 101 'name of left hand object'. A UID is an artificial sequence number, provided it is unique in a managed context. For example, the UID 4724 is a reference number of a telephone extension in the context of my company in The Hague. An identical number may refer to a different object in a different context, such as the extension with UID 4724 in the context of your company. The uniqueness context is given in column 16 (subject area). Such a context itself is defined on a separate line in a Gellish database table. Note that a fact represented by an association or relationship is also an object.
71	Uniqueness context identifier for left hand object name (UID-7) (LHContextUID)	integer (optional), 64 bit, default null. The uniqueness context identifier for left hand object name, also called the identifier of the language community, provides the context within which the left hand object name in column 101 is a unique reference to the object id in column 2, in addition to the language context (see columns 69 and 54). The context is superfluous (and is for human clarification only) on all lines other than lines with a specialization, a qualification, a classification or an alias relation and their subtypes, because only there are the left hand objects, identified by their UID, defined to have a name. If no context is given on a definition line, then the name for the left hand object is unique in the whole (natural) language (column 54) and no homonyms are then allowed (in the Gellish Dictionary).
16	Uniqueness context name for left hand object name (language community) (LHContextName)	string (optional), Unicode, nvarchar(255), default null. The uniqueness context name for left hand object name is the name for the uniqueness context of which the identifier is given in column 71. The name is optional (and is for human clarification only) because the context UID in column 71 shall be a reference to a context that is defined on another line, where its UID and name appear in columns 2 and 101 respectively.
38	Left hand object type name (LHObjectType)	string (optional), Unicode, nvarchar(255), default null. An object type of the left hand object (with the UID in column 2) indicates the name of the entity type of the left hand object in a particular data model about which the line defines the main fact. This column is superfluous in Gellish as it can be inferred via inheritance from the mapping of the appropriate object or its classifying class in the Gellish specialization hierarchy to the entity in appropriate data model.
39	Reality (LHReality)	string (optional), Unicode, nvarchar(255), default null. The reality is a classification of the left hand object, being either imaginary or materialized (= real). This indicates that the object is either a product of the mind or an object whose existence is based in the physical world, either as natural or as artificial object. If not specified, then the reality shall be interpreted from the context or from an explicit classification fact. For example, during design a pump will be an imaginary (although realistic) object; when fabricated, a pump will be a materialized object. Note that an object cannot be both imaginary and materialized. An installation relation relates an imaginary object to a materialized object. Classes are always imaginary.
56	Identifier of left hand term (UID-6) (LHTermUID)	integer (optional), 64 bit, default null. The identifier of left hand term is the unique identifier of the name in column 101, which is a name of the object identified in column 2. It is the UID of the encoded information to which the text in column 101 refers and vice versa. Basically this column is superfluous and is therefore left blank, because each unique string in column 101 identifies itself as a unique string. Therefore the string itself is its own unique identifier.
44	Left hand object cardinalities (LHCardinalities)	string (optional), non-unicode, varchar(32), default null. For common associations between classes this column contains the simultaneous cardinalities for the left hand object class. This means that it indicates the minimum and maximum number of members of the class that can be associated with a member of the right hand object class at the same time. The cardinalities may be specified by a comma separated list of two integers that indicate the lower and upper limit cardinalities. The upper limit may be the character 'n' to indicate that the upper limit is unlimited.
101	Left hand object name (LHObjectName)	string, Unicode, nvarchar(255). A ‘name’ of the object identified in column 2 and associated with it via an “is referenced as” association in a context referred to in column 54. For example, a tag name or some other code. It is the attribute of the encoded information identified in column 56. When there is no ID filled-in in column 56, then the text is only present for easy human reference to an object. It facilitates readability when the lines are sorted in a different sequence later. Nameless objects can exist, which implies that there is no instance in column 56 and 101 for an object in column 2.
72	Identifier of left hand role (LHRoleUID)	integer (optional), 64 bit, default null. An identifier of left hand role identifies the role that is played by the left hand object in column 2. This role is implicitly classified or is implicitly a subtype of the first or second kind of role that is required by the kind of relation in column 60.
73	Name of left hand role (LHRoleName)	string (optional), Unicode, nvarchar(255), default null. A name of left hand role is the name of the role in column 72.
43	Intention (Intention)	string (optional), non-unicode, varchar(255), default ‘true’. An intention indicates the extent to which the main fact is the case or is the case according to the author of a proposition. An intention includes also a level of truth. If a line expresses a proposition or communication fact, then the intention qualifies the proposition. If a line expresses a fact, then the intention indicates whether the relation of the type is true or false (not) or questionable (maybe). For example, the intention may indicate that a proposition is an affirmative request (question), confirmation, promise, declination, statement, denial, probability or acceptance. Default = ‘true’, which means a qualification of the statement: this fact “is the case”.
19	Unique identifier of validity context for main fact (ValContextUID)	integer (optional), 64 bit, default null. The unique identifier of validity context for main fact identifies the context within which the fact id, given in column 1, represents a valid fact. If not given, the fact is valid in all contexts.
18	Validity context name (ValContextName)	string (optional), Unicode, nvarchar(255), default null. The validity context name provides a name of the context that is identified in column 19.
1	Unique identifier of main fact (UID-1) (FactUID)
60	Relation type ID (RelTypeUID)
3	Relation type name (Gellish) (RelTypeName)
74	Identifier of right hand role (RHRoleUID)
75	Name of right hand role (RHRoleName)
52	Context name for right hand object id (RHUniqueContext)
15	Right hand unique object identifier (UID-3) (RHObjectUID)
45	Right hand object cardinalities (RHCardinalities)
55	Uniqueness context for right hand object name (RHUnContextName)
42	Description of main fact (template text) (FactDescription)
201	Right hand object name (RHObjectName)
65	Partial description (PartialDefinition)
4	Full definition (FullDefinition)
66	Unit of measure identifier (UoMUID)
7	Unit of measure name (UoM) (UoMName)
76	Accuracy of mapping UID (AccuracyUID)
77	Accuracy of mapping name (AccuracyName)
70	Picklist UID (DomainUID)
20	Picklist name (DomainName)
14	Remarks (Remarks)
8	Approval status of main fact (ApprovalStatus)
67	UID of succeeding fact (SuccessorUID)
9	Date of start of validity (EffectiveFrom)
10	Date of latest change (end of validity) (LatestUpdate)
12	Author of latest change (Author)
13	Reference or Source (Reference)
53	Line identifier (UID-5) (LineUID)
50	Unique plural fact identifier (UID-4) - see figure 3. (CollectionUID)
68	Name of collection of facts (CollectionName)
80	Left hand string commonality (LHCommonality)
81	Right hand string commonality (RHCommonality)
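The cardinalities format described for column 44 (and, by assumption, used in the same way for column 45) can be parsed as sketched below. The function name `parse_cardinalities` and the use of `None` to represent an unlimited upper limit are illustrative choices.

```python
def parse_cardinalities(text):
    """Parse 'lower,upper' where upper may be 'n' for unlimited.

    Returns (lower, upper) with upper set to None when it is unlimited.
    Raises ValueError for malformed input.
    """
    lower_text, upper_text = (part.strip() for part in text.split(","))
    lower = int(lower_text)
    upper = None if upper_text == "n" else int(upper_text)
    if lower < 0 or (upper is not None and upper < lower):
        raise ValueError(f"invalid cardinalities: {text!r}")
    return lower, upper


print(parse_cardinalities("0,1"))
print(parse_cardinalities("1,n"))
```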

Continue with "Dictionary Extension".

[1] See http://support.microsoft.com/kb/q180162/
