Automated Translation

1. Language-independent UIDs

In Gellish, each object is represented by its own UID, which is independent of any natural language. Each object is also accompanied by a language-dependent term that enables human interpretation in the natural language being used. So any object has only one UID, but it may have many 'names': the terms, codes and synonyms by which it may be denoted in different languages. For example, UID 40153 is a language-independent representation of a particular concept that is denoted in English by the term road and in Dutch by the term weg. Individual objects, relation types and facts also have their own unique UIDs and their language-dependent names or phrases.

Each Gellish expression is normally a sequence of such UIDs, terms and a phrase. Those expressions therefore have a natural-language-independent part (the UIDs) and a natural-language-dependent part (the 'names'). In addition, Gellish defines how the elements in the expression (the columns in the Gellish tables) relate to each other. The definition of those columns and the relations between them forms the language-independent grammar of the Gellish language.
This is illustrated by the following examples of expressions in a Gellish English database table:

Language      | UID of LH object | Name of LH object | UID of fact | UID of rel. type | Name of relation type   | UID of RH object | Name of RH object
English       | 40153            | road              | 201         | 2069             | <can have as aspect a>  | 550464           | width
International | 102              | B-23              | 202         | 1225             | <is classified as a>    | 40153            | road
Dutch         | 40153            | weg               | 203         | 4691             | <is a translation of>   | 40153            | road
Dutch         | 550464           | breedte           | 204         | 4691             | <is a translation of>   | 550464           | width

Table 1, Gellish English

The language in the first column indicates on each line the language of the expression, in particular the language of the left hand (LH) term, the relation type, the right hand (RH) term and the description (not shown in these tables). When relation type 4691 <is a translation of> is used, then the language of the left hand term and the language of the right hand term are different.
The first fact (201) expresses the general knowledge that a road can have as aspect a width. This fact is true, independent of the language in which it is expressed. Fact 201 is therefore expressed in a language-independent way by the combination of the UIDs:

  (40153, 2069, 550464)

All three UID’s in this example are selected from the Gellish English Dictionary, although their combination in fact 201 is new.
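
The split between the language-independent part and the language-dependent part can be sketched in Python as follows. This is only an illustration under assumptions: the names `Fact`, `NAMES` and `render` are hypothetical and not part of Gellish; the UIDs and terms are those of Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Language-independent core of an expression (hypothetical layout)."""
    fact_uid: int
    lh_uid: int
    rel_uid: int
    rh_uid: int


# Language-dependent 'names', keyed by (UID, language); values from Table 1.
NAMES = {
    (40153, "English"): "road",
    (550464, "English"): "width",
    (2069, "English"): "can have as aspect a",
}


def render(fact: Fact, language: str) -> str:
    """Attach the names of one language to the UID triple of a fact."""
    lh = NAMES[(fact.lh_uid, language)]
    rel = NAMES[(fact.rel_uid, language)]
    rh = NAMES[(fact.rh_uid, language)]
    return f"{lh} <{rel}> {rh}"


fact_201 = Fact(fact_uid=201, lh_uid=40153, rel_uid=2069, rh_uid=550464)
print(render(fact_201, "English"))
```

The `Fact` object holds only UIDs, so it stays the same whatever language is used for presentation; only the lookup in `NAMES` changes.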

The second fact (202) in Table 1 classifies a particular individual object (UID 102) as a road and gives its name, B-23. The language 'International' in the first column indicates that this name is language independent.

  • Note: The Gellish Dictionary contains a number of concepts whose ‘names’ are language independent, such as numbers and units of measure. Examples of International ‘names’ for numbers are 1, 2, 3, etc. For decimals such as 3.5, Gellish uses the dot as the International separator, although a number of languages, such as Dutch and German, use a comma (‘,’) as separator. Examples of International ‘names’ of units of measure are mm, m, km, bar, psi, deg C, deg F, etc.
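
The separator convention in the note can be sketched as follows. This is only an illustration: the function name `to_international` is hypothetical, and the sketch assumes that the input contains no digit-grouping separators.

```python
def to_international(number_text: str, decimal_separator: str = ",") -> str:
    """Return the number written with the dot as decimal separator."""
    converted = number_text.replace(decimal_separator, ".")
    float(converted)  # raises ValueError if the result is not a valid number
    return converted


# Dutch or German notation rewritten in the International notation.
print(to_international("3,5"))
```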

2. Automated translation

The four lines of Table 1 are sufficient for suitable software to present the same facts in Dutch (assuming that translations of the relation types are also available). This is possible because the UIDs of the concepts are language independent, and facts 203 and 204 on the third and fourth lines give the Dutch names of the concepts used. This means that Gellish-enabled software is able to create Table 2 from an interpretation of the content of Table 1!

This also illustrates that there is no need to specify the same facts again in another language. Gellish-enabled software only needs a dictionary of the other language to be able to present the facts in that language as well.
Nevertheless, an automatically translated Gellish database might be stored as in Table 2.

Language       | UID of LH object | Name of LH object | UID of fact | UID of rel. type | Name of relation type        | UID of RH object | Name of RH object
Nederlands     | 40153            | weg               | 201         | 2069             | <kan als aspect hebben een>  | 550464           | breedte
Internationaal | 102              | B-23              | 202         | 1225             | <is geclassificeerd als een> | 40153            | weg
Nederlands     | 40153            | weg               | 203         | 4691             | <is een vertaling van>       | 40153            | road
Nederlands     | 550464           | breedte           | 204         | 4691             | <is een vertaling van>       | 550464           | width

Table 2, Gellish Nederlands
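
The derivation of Table 2 from Table 1 can be sketched in Python as follows. This is a minimal illustration, not the way Gellish-enabled software actually works: the row layout, the function `translate_to_dutch` and the dictionary `DUTCH_RELATIONS` are assumptions. The Dutch phrases for the relation types are those of Table 2 and are assumed to come from a Dutch dictionary of relation types.

```python
# Rows of Table 1:
# (language, lh_uid, lh_name, fact_uid, rel_uid, rel_name, rh_uid, rh_name)
TABLE_1 = [
    ("English", 40153, "road", 201, 2069, "can have as aspect a", 550464, "width"),
    ("International", 102, "B-23", 202, 1225, "is classified as a", 40153, "road"),
    ("Dutch", 40153, "weg", 203, 4691, "is a translation of", 40153, "road"),
    ("Dutch", 550464, "breedte", 204, 4691, "is a translation of", 550464, "width"),
]

TRANSLATION_REL = 4691  # UID of <is a translation of>

# Assumption: Dutch phrases for the relation types are available separately.
DUTCH_RELATIONS = {
    2069: "kan als aspect hebben een",
    1225: "is geclassificeerd als een",
    4691: "is een vertaling van",
}


def translate_to_dutch(rows):
    """Re-express the facts in Dutch, keeping every UID unchanged."""
    # Step 1: collect the Dutch names from the translation facts.
    dutch_names = {
        lh_uid: lh_name
        for lang, lh_uid, lh_name, _f, rel_uid, _r, _rh_uid, _rh in rows
        if rel_uid == TRANSLATION_REL and lang == "Dutch"
    }
    # Step 2: rewrite only the language-dependent columns.
    translated = []
    for lang, lh_uid, _lh, fact_uid, rel_uid, _rel, rh_uid, rh_name in rows:
        new_lang = "Internationaal" if lang == "International" else "Nederlands"
        new_lh = dutch_names.get(lh_uid, _lh)
        # In a translation fact the RH term stays in the source language.
        if rel_uid == TRANSLATION_REL:
            new_rh = rh_name
        else:
            new_rh = dutch_names.get(rh_uid, rh_name)
        translated.append(
            (new_lang, lh_uid, new_lh, fact_uid, rel_uid,
             DUTCH_RELATIONS[rel_uid], rh_uid, new_rh)
        )
    return translated


for row in translate_to_dutch(TABLE_1):
    print(row)
```

Names without a Dutch translation, such as the International name B-23, are passed through unchanged.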

3. Multi-language support

Comparison of the content of Table 1 and Table 2 illustrates that they represent the same facts in two different languages. So the facts are the same, but the expressions are different. The facts with UID 201, 202, 203 and 204 are facts that are true, independent of any language in which the facts are expressed.

The last two facts (203 and 204) in Table 1 are expressions in English about the translation of the terms road and width. They represent facts in an English – Dutch dictionary. For that reason the name of the language in the first column (‘Dutch’) is given in English; in Dutch it would have been ‘Nederlands’. The equivalent facts 203 and 204 in Table 2, however, are expressions in Dutch about the same facts. They represent the facts as they appear in a Nederlands – Engels woordenboek (Dutch – English dictionary).

Table 1 and Table 2 illustrate that the UID of a fact remains the same, even if the language of the expression changes. This means that a database that includes expressions in multiple languages may have multiple lines with the same fact UID. Such lines are distinguished by their line UIDs, which differ. Note that, to save space, the line UIDs are not shown in Table 1 and Table 2.
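
A small Python sketch of a multi-language store in which several lines share one fact UID. The line UIDs used here (1, 2, 3, ...) are invented for the illustration, since the real line UIDs are not shown in the tables; the expressions are taken from Table 1 and Table 2.

```python
from collections import defaultdict
from itertools import count

# (language, fact UID, expression) for facts 201 and 202 in both languages.
expressions = [
    ("English", 201, "road <can have as aspect a> width"),
    ("Nederlands", 201, "weg <kan als aspect hebben een> breedte"),
    ("International", 202, "B-23 <is classified as a> road"),
    ("Internationaal", 202, "B-23 <is geclassificeerd als een> weg"),
]

# Hypothetical numbering: every stored line gets its own line UID.
line_uids = count(1)
lines = [
    (next(line_uids), lang, fact_uid, text)
    for lang, fact_uid, text in expressions
]

# Group the line UIDs by the fact they express.
lines_by_fact = defaultdict(list)
for line_uid, _lang, fact_uid, _text in lines:
    lines_by_fact[fact_uid].append(line_uid)

print(dict(lines_by_fact))
```

Each fact UID maps to more than one line UID, one per language in which the fact is expressed.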

4. ASCII and Unicode

Because the ASCII character set is not sufficient to represent the characters of many languages, nor the names of units of measure, Gellish is expressed in Unicode by default, although ASCII may also be used. The character set in use is indicated by a separate parameter.
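
The limitation of ASCII can be illustrated with a short Python check. The function `fits_charset` and the example name "µm" are assumptions made for this illustration; how the Gellish character-set parameter is actually stored is not specified here.

```python
def fits_charset(name: str, charset: str) -> bool:
    """Report whether a name can be encoded in the given character set."""
    try:
        name.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


for name in ("breedte", "deg C", "µm"):
    print(name, fits_charset(name, "ascii"), fits_charset(name, "utf-8"))
```

Names such as "deg C" fit in ASCII, whereas a name containing a character like "µ" needs Unicode.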

Continue with Change Management

automated_translation.txt · Last modified: 2017/08/11 15:10 (external edit)