Document management

This section describes a way of managing and searching for documents, using Gellish to store auxiliary information about the documents and about the objects on which the documents provide information. It also covers the management and description of drawings and of information that is present on various media, such as text files, sound and video.

1. Distinction between information and information carriers

For proper modeling of documents it is important to distinguish between a document as information and the physical document that is a carrier of that information. The concept ‘document’ is defined in Gellish as information that is presented on one or more physical information carriers. An information carrier can be, for example, ink on paper or optical or electronic patterns on media. In other words, information is the common content of information carriers that present the same content.
A particular piece of information that can be presented on more than one different physical carrier is called 'qualitative information'. For example, a particular document (a piece of qualitative information) may exist in doc file format as well as in pdf file format, as well as in printed form. This illustrates that the qualitative information is the common content of all those information carriers (physical files and paper). Thus the qualitative information is distinguished from the physical information carriers that present the information.
Qualitative information can be included in Gellish in two ways: as a text string that is the description of an “information object”, or as a reference to a physical file or object that contains the information. The following sections describe both ways.

2. Modeling of textual information

Every object (represented by a UID) is normally related to two text strings: a name and a description (which may be a definition). A name is at most 255 characters long (Unicode, nvarchar(255)). A description has an unlimited length (Unicode, ntext) in Gellish. An application system might nevertheless impose a shorter field length for the name and may therefore require that short synonyms or abbreviations are specified. A piece of text is an 'information object' (also called 'qualitative information') that is represented by its own UID and has a name, whereas the text itself is expressed as the description of the information object.
For example, assume that a document about road design expresses requirements for motorways, and that the document is decomposed into fragments that are expressed in Gellish, so that the knowledge is available to road designers who search for requirements about motorways and thus search for 'motorway'. Assume that Paragraph 3.1 of that document consists of the text: “A motorway shall have …”. Such a description shall be provided in a Gellish expression table on the line that defines the information object, that is, the line that specifies what kind of information it is (or on a line where a translation of the name is specified). Thus it shall be given on the line that specifies that the information object is a qualitative subtype of some kind of information.
Then the semantic network will contain the following expressions:

| UID of left hand object | Name of left hand object | UID of idea | UID of kind of relation | Name of kind of relation | UID of right hand object | Name of right hand object | Full description |
| 110 | motorway | 210 | 5398 | shall be compliant with | 111 | Paragraph 3.1 | |
| 111 | Paragraph 3.1 | 211 | 1726 | is a qualitative subtype of | 970007 | requirement | A motorway shall have a … |

Note that the paragraph (object 111) has a name (Paragraph 3.1), whereas the content of the paragraph is specified in the description of the object, on the line where it is defined as a qualitative subtype of requirement (idea 211).
Software can thus enable people who search for a 'requirement' about 'motorway' to retrieve all requirements about motorways.
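The search described above can be sketched in Python. This is only an illustration under stated assumptions: the `Expression` tuple and the `find_information` function are hypothetical names, not part of any Gellish software, and the matching is done on plain names, whereas a real implementation would also follow the subtype hierarchy and synonyms.

```python
from typing import NamedTuple


class Expression(NamedTuple):
    """One line of an expression table (hypothetical in-memory form)."""
    lh_uid: int
    lh_name: str
    idea_uid: int
    rel_uid: int
    rel_name: str
    rh_uid: int
    rh_name: str
    description: str = ""


# The two expressions from the table above.
EXPRESSIONS = [
    Expression(110, "motorway", 210, 5398, "shall be compliant with",
               111, "Paragraph 3.1"),
    Expression(111, "Paragraph 3.1", 211, 1726, "is a qualitative subtype of",
               970007, "requirement", "A motorway shall have a …"),
]

SHALL_BE_COMPLIANT_WITH = 5398
IS_A_QUALITATIVE_SUBTYPE_OF = 1726


def find_information(expressions, kind_name, subject_name):
    """Return (name, text) of each information object of the given kind
    that the named subject shall be compliant with."""
    # Information objects defined as a qualitative subtype of the kind.
    defined = {e.lh_uid: e for e in expressions
               if e.rel_uid == IS_A_QUALITATIVE_SUBTYPE_OF
               and e.rh_name == kind_name}
    # Follow the compliance relation from the subject to those objects.
    return [(defined[e.rh_uid].lh_name, defined[e.rh_uid].description)
            for e in expressions
            if e.rel_uid == SHALL_BE_COMPLIANT_WITH
            and e.lh_name == subject_name
            and e.rh_uid in defined]


print(find_information(EXPRESSIONS, "requirement", "motorway"))
```

The text of the paragraph is read from the description on the defining line (idea 211), which is why the lookup is keyed on the qualitative subtype relation.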

3. Documents about objects

In order to facilitate searching and retrieval of documents via the objects about which they provide information it is recommended that the objects about which a document provides information are related to the documents. When those objects are part of a larger assembly it is recommended that the objects are integrated in an integrated model, resulting in integrated information. This will then enable searching and retrieval of documents as well as data about the objects.
For example, the fact that motorway M1 is documented on drawing T-12345 is expressed in Gellish as follows:

| UID of left hand object | Name of left hand object | UID of idea | UID of kind of relation | Name of kind of relation | UID of right hand object | Name of right hand object | Full description |
| 112 | motorway M1 | 212 | 5046 | is described in | 120 | T-12345 | |
| 112 | motorway M1 | 213 | 1225 | is classified as a | 110 | motorway | that connects London with Birmingham |
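As a rough sketch of retrieval via the described objects, the following Python fragment finds the documents that describe any object classified as a given kind. The tuple layout and the function name `documents_about_kind` are assumptions made for illustration. Only direct classification is considered here, and subtypes of the kind are ignored.

```python
# Each tuple: (lh_uid, lh_name, idea_uid, rel_uid, rel_name, rh_uid, rh_name)
EXPRESSIONS = [
    (112, "motorway M1", 212, 5046, "is described in", 120, "T-12345"),
    (112, "motorway M1", 213, 1225, "is classified as a", 110, "motorway"),
]

IS_DESCRIBED_IN = 5046
IS_CLASSIFIED_AS_A = 1225


def documents_about_kind(expressions, kind_uid):
    """Return (object name, document name) pairs for all objects that are
    classified as the given kind and are described in some document."""
    members = {lh for lh, _, _, rel, _, rh, _ in expressions
               if rel == IS_CLASSIFIED_AS_A and rh == kind_uid}
    return [(lh_name, rh_name)
            for lh, lh_name, _, rel, _, _, rh_name in expressions
            if rel == IS_DESCRIBED_IN and lh in members]


# Documents about anything that is a motorway (UID 110).
print(documents_about_kind(EXPRESSIONS, 110))
```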

4. References to electronic files

Often a piece of information is a complete document, or it contains more than (Unicode) text, as is the case for drawings, sound or video. In those cases the document will usually be an electronic data file in some file format. For example, a drawing might be stored in AutoCAD dwg file format and may also be expressed in tiff and pdf format. The physical documents (files) are then not included in the Gellish expression table itself, but are stored externally, whereas the expressions contain references to the external files or to other physical objects.
For example, drawing T-12345 is scanned and stored as a tiff file, which is then converted into a pdf file named 002.pdf. The pdf file is stored in the directory 'Examples' on the C drive of the computer. Expressed in Gellish, this results in the following expressions:

| UID of left hand object | Name of left hand object | UID of idea | UID of kind of relation | Name of kind of relation | UID of right hand object | Name of right hand object |
| 120 | T-12345 | 220 | 1726 | is a qualitative subtype of | 490196 | drawing |
| 120 | T-12345 | 221 | 4996 | is presented on | 121 | 002.pdf |
| 121 | 002.pdf | 222 | 1225 | is classified as a | 40153 | electronic data file |
| 121 | 002.pdf | 223 | 1227 | is an element of | 122 | C:\\Examples |
| 122 | C:\\Examples | 224 | 1225 | is classified as a | 492017 | directory |
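A minimal sketch of how software might resolve such references into a file path is given below. It assumes a hypothetical tuple layout and function name (`file_paths`), and it assumes that the name of the directory object is a usable Windows path. Note that the Python literal "C:\\Examples" denotes the path C:\Examples with a single backslash.

```python
from pathlib import PureWindowsPath

# Each tuple: (lh_uid, lh_name, idea_uid, rel_uid, rel_name, rh_uid, rh_name)
EXPRESSIONS = [
    (120, "T-12345", 220, 1726, "is a qualitative subtype of", 490196, "drawing"),
    (120, "T-12345", 221, 4996, "is presented on", 121, "002.pdf"),
    (121, "002.pdf", 222, 1225, "is classified as a", 40153, "electronic data file"),
    (121, "002.pdf", 223, 1227, "is an element of", 122, "C:\\Examples"),
    (122, "C:\\Examples", 224, 1225, "is classified as a", 492017, "directory"),
]

PRESENTED_ON, CLASSIFIED_AS, ELEMENT_OF = 4996, 1225, 1227
DATA_FILE, DIRECTORY = 40153, 492017


def file_paths(expressions, document_uid):
    """Return the full paths of the electronic data files on which the
    given document (qualitative information) is presented."""

    def related(lh_uid, rel_uid):
        # Right hand objects reached from lh_uid via the given relation.
        return [(rh, rh_name)
                for lh, _, _, rel, _, rh, rh_name in expressions
                if lh == lh_uid and rel == rel_uid]

    def is_a(uid, kind_uid):
        return any(rh == kind_uid for rh, _ in related(uid, CLASSIFIED_AS))

    paths = []
    for file_uid, file_name in related(document_uid, PRESENTED_ON):
        if not is_a(file_uid, DATA_FILE):
            continue  # e.g. a sheet of paper, not a file
        for dir_uid, dir_name in related(file_uid, ELEMENT_OF):
            if is_a(dir_uid, DIRECTORY):
                paths.append(str(PureWindowsPath(dir_name) / file_name))
    return paths


print(file_paths(EXPRESSIONS, 120))
```

`PureWindowsPath` is used so that the path is joined with Windows rules regardless of the platform on which the sketch runs.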

5. References to physical copies

The original master copy of a document may be a sheet of paper that is located in archive A. The additional statements that describe this are expressed as follows:

| UID of left hand object | Name of left hand object | UID of idea | UID of kind of relation | Name of kind of relation | UID of right hand object | Name of right hand object |
| 120 | T-12345 | 225 | 4996 | is presented on | 123 | T-12345 sheet |
| 123 | T-12345 sheet | 226 | 1225 | is classified as a | 492033 | sheet of paper |
| 123 | T-12345 sheet | 227 | 5138 | is located in | 124 | archive A |
| 124 | archive A | 228 | 1225 | is classified as a | 970492 | archive |
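Locating a physical copy follows the same pattern. The sketch below, with a hypothetical function name `locate_copies` and an assumed tuple layout, lists each carrier of a document together with its kind and its location, where these are known.

```python
# Each tuple: (lh_uid, lh_name, idea_uid, rel_uid, rel_name, rh_uid, rh_name)
EXPRESSIONS = [
    (120, "T-12345", 225, 4996, "is presented on", 123, "T-12345 sheet"),
    (123, "T-12345 sheet", 226, 1225, "is classified as a", 492033, "sheet of paper"),
    (123, "T-12345 sheet", 227, 5138, "is located in", 124, "archive A"),
    (124, "archive A", 228, 1225, "is classified as a", 970492, "archive"),
]

PRESENTED_ON, CLASSIFIED_AS, LOCATED_IN = 4996, 1225, 5138


def locate_copies(expressions, document_uid):
    """Return (carrier name, kind of carrier, location) for every carrier
    on which the document is presented; unknown values are None."""
    kind = {e[0]: e[6] for e in expressions if e[3] == CLASSIFIED_AS}
    location = {e[0]: e[6] for e in expressions if e[3] == LOCATED_IN}
    return [(e[6], kind.get(e[5]), location.get(e[5]))
            for e in expressions
            if e[3] == PRESENTED_ON and e[0] == document_uid]


print(locate_copies(EXPRESSIONS, 120))
```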

6. Software requirements for file support

Gellish-powered software that supports searching for and retrieving documents that are stored in files shall have the following capabilities:

1. Recognize any object that is classified as an ‘electronic data file’.
2. Determine the file type of such an object from the file extension (or from the value of an aspect that is classified as a file format).
3. Determine the collection of files that is classified as a directory.
4. Determine the location of the directory from the name of the collection.
5. Launch the appropriate application that opens the file for viewing or editing.
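The five capabilities can be sketched together as follows. This is an illustration under assumptions, not a reference implementation: the tuple layout, the function names and the `launcher` callback are hypothetical, the file type is taken from the extension only, and the directory name is assumed to be a Windows path. The actual launching is left to the caller, since it is platform dependent (on Windows, `os.startfile` would be one candidate).

```python
from pathlib import PureWindowsPath

CLASSIFIED_AS, ELEMENT_OF = 1225, 1227
DATA_FILE, DIRECTORY = 40153, 492017

# Each tuple: (lh_uid, lh_name, rel_uid, rh_uid, rh_name); idea UIDs and
# relation names are omitted here for brevity.
EXPRESSIONS = [
    (121, "002.pdf", 1225, 40153, "electronic data file"),
    (121, "002.pdf", 1227, 122, "C:\\Examples"),
    (122, "C:\\Examples", 1225, 492017, "directory"),
]


def instances_of(expressions, kind_uid):
    """Map UID to name for all objects classified as the given kind."""
    return {lh: name for lh, name, rel, rh, _ in expressions
            if rel == CLASSIFIED_AS and rh == kind_uid}


def open_files(expressions, launcher):
    """Resolve every electronic data file and hand it to the launcher."""
    files = instances_of(expressions, DATA_FILE)        # capability 1
    directories = instances_of(expressions, DIRECTORY)  # capability 3
    opened = []
    for lh, name, rel, rh, _ in expressions:
        if rel == ELEMENT_OF and lh in files and rh in directories:
            # Capability 2: file type from the file extension.
            file_type = PureWindowsPath(name).suffix.lstrip(".").lower()
            # Capability 4: location from the name of the collection.
            path = str(PureWindowsPath(directories[rh]) / name)
            # Capability 5: delegate opening to the supplied launcher.
            launcher(path, file_type)
            opened.append((path, file_type))
    return opened


# Usage with a stand-in launcher that only records what would be opened.
launched = []
result = open_files(EXPRESSIONS, lambda path, file_type: launched.append(path))
print(result)
```

Passing the launcher in as a function keeps the lookup logic testable without opening any application.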

Continue with Knowledge modeling

document_management.txt · Last modified: 2018/11/04 00:00 by andries