Data Modeling and Database Design in Gellish English

1. Limitations of conventional database design

Data modelling for an application domain typically results in a domain specific conceptual data model, also called a conceptual scheme. Such a data model is intended to document the requirements for storage of data in a database and thus defines the semantic capabilities for expressions of facts that can be stored in the future database.
In conventional data modelling methodologies, such as ORM (Object-Role Modelling) and others, each conceptual data model consists of a definition of the concepts (classes, entity types and attribute types) that are relevant for the application domain, together with relations (relation types) between those concepts. The concepts classify the instances of objects and their aspects, whereas the relation types determine the kinds of facts (instances) that can be stored in the database. Such a conceptual data model is then converted into a logical and finally into a physical data model that defines the structure of the database that will be created (the definitions of the database tables and the relations between them).
A shortcoming of this database development process is that the storage capabilities of the resulting databases are fixed, especially because their data structure is dedicated and limited to the application domain (the Universe of Discourse), and that it is time consuming and costly to modify the structure and scope of such databases.

2. The Gellish Modeling Method

The Gellish Modeling Method extends the conventional data modeling methods in various ways:

  • It makes use of a smart Gellish English dictionary in which concepts and relation types are already defined, so that only definitions of additional concepts and relation types need to be added.
  • It defines a universal standardised Gellish Database table in which any kind of fact can be stored.
  • It adds the capability to directly instantiate the conceptual data model.
  • It provides the flexibility to modify the data model without the need to regenerate the database.

To achieve this, the conceptual data model is expressed in Gellish English and stored in a standardised Gellish database. Then the data model is not used to create the database structure, but is directly used to guide the creation of instances that are stored in the same standardised Gellish database table. This means that in Gellish English the database design process is simplified and significantly reduced. It also means that universal application software can be created that enables the storage and retrieval of any kind of fact.
The sections below describe how to create systems that use a flexible data model stored in a Gellish database, and how to create a flexible user interface that allows users to enter a large variety of information without the need to modify the database structure.

2.1 Conceptual data models in Gellish

A data analysis and data modelling process converts domain knowledge of a ‘Universe of Discourse’ into a conceptual data model (also called a knowledge model). The Gellish methodology requires that during such a process the Gellish English dictionary shall be used in order to express the data model by using standard concepts and standard relation types and not reinventing concepts and definitions. When concepts or relation types are not available in the Gellish English dictionary, then they shall be added as proprietary extensions according to the rules for extension of the Gellish English dictionary. After that, the data model and the dictionary are used by application software to guide the process to create database instances.

A data model consists of facts about kinds of things. Such facts can be used by application software to create instances, which are facts about individual things. In other words, application software creates facts about individual things on the basis of knowledge facts about kinds of things.

For example, a data model that defines the information that should be stored about pumps will be used to guide the process of expressing facts (information) about individual pumps. The result of describing an individual pump is a collection of facts about that pump. The process of expressing facts about other individual objects, such as a person, a piece of software or a process, or about relations between them, is basically the same, although other kinds of aspects will be relevant for those objects. Each such fact is expressed by a relation between objects, whereas the nature of that relation is defined by the (standard) relation type that classifies it.
The data model can be presented in a graphical form, for example in an ORM schematic drawing, but it can also be presented in the form of a Gellish Database table as follows:

| Language community | UID of left hand object | Name of left hand object | Fact UID | UID of relation type | Name of relation type | UID of right hand object | Name of right hand object |
| rotating eq. | 130206 | pump | 101 | 2069 | can have as aspect a | 551564 | capacity |
| rotating eq. | 130206 | pump | 102 | 1191 | can have as part a | 130030 | bearing |
| rotating eq. | 130058 | centrifugal pump | 103 | 1146 | is a specialization of | 130206 | pump |
| rotating eq. | 130058 | centrifugal pump | 104 | 1191 | can have as part a | 130144 | impeller |
| rotating eq. | 130144 | impeller | 105 | 2069 | can have as aspect a | 550188 | diameter |

Table 1, Data model for data about pumps
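As an illustration only, the rows of Table 1 can be held in a small record type. The following Python sketch is not part of Gellish; the class name `Fact`, its field names and the `expression` helper are hypothetical names chosen for this example, and only the columns shown in Table 1 are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One row of a Gellish-style fact table (columns as shown in Table 1).

    The field names are hypothetical; they are not the official column IDs.
    """
    language_community: str
    lh_uid: int
    lh_name: str
    fact_uid: int
    rel_type_uid: int
    rel_type_name: str
    rh_uid: int
    rh_name: str

    def expression(self) -> str:
        """Render the fact in the '<relation type>' notation used in the text."""
        return f"{self.lh_name} <{self.rel_type_name}> {self.rh_name}"


# The five knowledge facts of Table 1.
TABLE_1 = [
    Fact("rotating eq.", 130206, "pump", 101, 2069, "can have as aspect a", 551564, "capacity"),
    Fact("rotating eq.", 130206, "pump", 102, 1191, "can have as part a", 130030, "bearing"),
    Fact("rotating eq.", 130058, "centrifugal pump", 103, 1146, "is a specialization of", 130206, "pump"),
    Fact("rotating eq.", 130058, "centrifugal pump", 104, 1191, "can have as part a", 130144, "impeller"),
    Fact("rotating eq.", 130144, "impeller", 105, 2069, "can have as aspect a", 550188, "diameter"),
]

for fact in TABLE_1:
    print(fact.fact_uid, fact.expression())
```

Because every row has the same shape, the same record type can hold knowledge facts and, later, facts about individual things.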

All concepts (kinds of things) that are used in Table 1, such as pump, centrifugal pump and impeller, and properties such as capacity and diameter, are already defined in the Gellish English Dictionary, and thus their Gellish unique identifiers (the UIDs 130206, 130058, etc.) are selected from that dictionary. The relation types are also selected from the Gellish Dictionary. The only new things in this table are the facts, indicated by the Fact UIDs, that express the knowledge, except for fact 103, which is not new. Fact 103 is a superfluous duplicate of a fact that already exists in the Gellish Dictionary, because such subtype-supertype relationships are part of the definition of the concepts in the Gellish English Dictionary. This makes it a smart dictionary that is arranged as a taxonomy.

2.1.1 Inheritance

Note that the taxonomy, the subtype-supertype hierarchy, defines the inheritance of knowledge from supertype kinds of things to their subtype kinds of things. For this example, this means that fact 103 ensures that the concept ‘centrifugal pump’ inherits from the concept ‘pump’ that a centrifugal pump can also have a capacity, without an explicit specification of that fact.
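The inheritance rule can be sketched as a walk up the specialization relations. The Python fragment below is a minimal illustration under the assumption that the knowledge facts of Table 1 are reduced to (left name, relation type, right name) triples; the function names `supertypes` and `applicable_knowledge` are hypothetical.

```python
# Knowledge facts of Table 1, reduced to (left, relation type, right) triples.
FACTS = [
    ("pump", "can have as aspect a", "capacity"),
    ("pump", "can have as part a", "bearing"),
    ("centrifugal pump", "is a specialization of", "pump"),
    ("centrifugal pump", "can have as part a", "impeller"),
    ("impeller", "can have as aspect a", "diameter"),
]
SPECIALIZATION = "is a specialization of"


def supertypes(concept, facts):
    """Return the concept followed by all its direct and indirect supertypes."""
    chain = [concept]
    queue = [concept]
    while queue:
        current = queue.pop(0)
        for left, relation, right in facts:
            if left == current and relation == SPECIALIZATION and right not in chain:
                chain.append(right)
                queue.append(right)
    return chain


def applicable_knowledge(concept, facts):
    """Return the knowledge facts that hold for a concept, own and inherited."""
    chain = supertypes(concept, facts)
    return [
        (left, relation, right)
        for left, relation, right in facts
        if left in chain and relation != SPECIALIZATION
    ]


for fact in applicable_knowledge("centrifugal pump", FACTS):
    print(fact)
```

For ‘centrifugal pump’ the result contains its own fact about the impeller as well as the facts about capacity and bearing that are stated only for ‘pump’.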

2.1.2 Dictionary extensions

If the data modelling process (also called the knowledge modelling process) identifies concepts or relation types that are required but that do not exist yet in the Gellish English Dictionary, then those concepts and relation types should be added to the Gellish English Dictionary as proprietary extensions. Such additions shall be compliant with the rules for extension of the Gellish language. It is recommended to nominate such extensions for inclusion in the Gellish Dictionary.

For details of the process to convert domain knowledge and requirements into a data model it is recommended to use a conventional data modelling methodology, such as ORM, in combination with the guidelines on the modelling of knowledge and specifications. Here we assume that a data model that is expressed in Gellish English and stored in a Gellish database table is available.

2.2 A data model stored as instances in a universal Gellish database

In conventional data modelling it is common practice to convert a conceptual data model (also called a knowledge model) into a database design (a physical data model). This is also possible for the above example knowledge. However, it is not necessary for a Gellish database, because a Gellish database has a predefined general purpose data structure and consists of tables that all have the same standardized database table definition (see ‘The Gellish Database table definition’). This means that for a Gellish database there is no separate database design required. Table 1 is an example of the core of such a Gellish database table, loaded with facts that specify the data model with knowledge about pumps.

This means that the data model that is specified in Table 1 does not need to be converted into a database design, but can be directly used to guide the process to create instances that can be stored in a Gellish database table with the same structure (although with other relation types). This is done as is described below.
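To make the idea of one fixed table concrete, the sketch below stores knowledge facts and a fact about an individual thing in the same table, using Python's standard `sqlite3` module. The table name `fact` and the column names are assumptions made for this example, and the columns are only the subset shown in Table 1, not the full Gellish Database table definition. Fact 301 is taken from Table 3 further down.

```python
import sqlite3

# One generic table for every kind of fact; no domain-specific tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fact (
        language_community TEXT,
        lh_uid INTEGER,
        lh_name TEXT,
        fact_uid INTEGER PRIMARY KEY,
        rel_type_uid INTEGER,
        rel_type_name TEXT,
        rh_uid INTEGER,
        rh_name TEXT
    )
    """
)

rows = [
    # Knowledge facts (facts 101 and 103 of Table 1).
    ("rotating eq.", 130206, "pump", 101, 2069, "can have as aspect a", 551564, "capacity"),
    ("rotating eq.", 130058, "centrifugal pump", 103, 1146, "is a specialization of", 130206, "pump"),
    # A fact about an individual thing (fact 301 of Table 3), same structure.
    ("Project A", 501, "P-1301", 301, 1225, "is classified as a", 130058, "centrifugal pump"),
]
conn.executemany("INSERT INTO fact VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)


def facts_about(name):
    """Return all facts that have the named object as left hand object."""
    cursor = conn.execute(
        "SELECT fact_uid, lh_name, rel_type_name, rh_name "
        "FROM fact WHERE lh_name = ? ORDER BY fact_uid",
        (name,),
    )
    return cursor.fetchall()


print(facts_about("P-1301"))
print(facts_about("pump"))
```

Adding a new kind of fact only adds rows; the `CREATE TABLE` statement never changes.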

2.3 Creation of instances

Each fact in a data model that is expressed in Gellish English as a relation between concepts can result in “instances” that consist of one or more facts that are expressed in Gellish English as a relation between individual things, together with classification relations between the individual things and the concepts (classes) that classify them.
For example, assume that we want to store facts about a particular centrifugal pump, which is called P-1301. In other words we want to create an individual thing that is classified as a centrifugal pump and we want to create expressions of facts with information about that individual thing. To create that individual thing in Gellish it is required to specify a classification relation between the individual thing P-1301 and the concept (class) that classifies the thing. Such a fact is expressed in Gellish English as follows:

  • P-1301 <is classified as a> centrifugal pump (fact 301)

This explicit classification relation implies that the data model with knowledge about centrifugal pumps as given in Table 1 is applicable for P-1301, including also the inherited knowledge facts 101 and 102.
Note that the data model in Table 1 uses concepts from the Gellish dictionary (taxonomy), in which a large number of subtypes of pump are defined. For example, it is defined that a line shaft pump is a subtype of centrifugal pump. This means that the knowledge about pumps and centrifugal pumps also applies to their subtypes, such as line shaft pumps. Thus the data model also makes it possible to classify P-1301 as any type of pump, including as a line shaft pump.

In Gellish English there is a distinction between relation types that are used to express knowledge (facts about kinds of things) and relation types that are used to express information (facts about individual things). For example, fact 101 expresses the knowledge that a

  • pump <can have as aspect a> capacity (fact 101)

This knowledge can be used to create a fact with information about P-1301 that states that P-1301 has a particular capacity, called cap-1301 (say). For the expression of that latter fact another relation type shall be used in Gellish as follows:

  • P-1301 <has as aspect> cap-1301 (fact 302)

In the Gellish Dictionary it is defined which relation type shall be used to create information about individual things as a realization of a relation type used to express knowledge. Table 2 illustrates which relation types should be used to create realizations of the relation types used in Table 1.

| Relation type to express knowledge | Name of relation type | Relation type to express information |
| can have as aspect a | can be realized by a | has as aspect |
| can have as part a | can be realized by a | has as part |

Table 2, Relation types for product models versus relation types for knowledge models
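In software, Table 2 amounts to a lookup from the relation type of a knowledge fact to the relation type of its realization. The Python sketch below hard-codes the two rows of Table 2 for illustration; in a real system this mapping would presumably be read from the Gellish Dictionary. The names `REALIZED_BY` and `realize` are hypothetical.

```python
# The two 'can be realized by a' rows of Table 2.
REALIZED_BY = {
    "can have as aspect a": "has as aspect",
    "can have as part a": "has as part",
}


def realize(knowledge_fact, left_individual, right_individual):
    """Create an information fact as a realization of a knowledge fact.

    knowledge_fact is a (left concept, relation type, right concept) triple.
    A KeyError is raised for relation types that are not listed in Table 2.
    """
    _, knowledge_relation, _ = knowledge_fact
    information_relation = REALIZED_BY[knowledge_relation]
    return (left_individual, information_relation, right_individual)


fact_101 = ("pump", "can have as aspect a", "capacity")
print(realize(fact_101, "P-1301", "cap-1301"))
```

Applied to fact 101, the lookup yields the relation type used in fact 302.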

Furthermore, it is a rule in Gellish that every individual thing shall be classified. So, when fact 302 about P-1301 is created, this rule implies that P-1301 as well as cap-1301 shall be classified. They shall be classified by the concepts (classes) that are the left hand and right hand objects in the fact that expresses the knowledge (fact 101), or by a subtype of those concepts, as is the case for P-1301, which is classified as a centrifugal pump. In addition, every individual aspect shall be qualified: when the aspect is classified by a subtype of property, it can either be qualified by a qualitative aspect (e.g. a colour can be qualified as ‘red’) or be quantified on a scale (e.g. a diameter can be quantified by a number on a scale, such as 300 mm). So, knowledge fact 101 will result in the following facts about the individual object P-1301:

| Language community | UID of left hand object | Name of left hand object | Fact UID | UID of relation type | Name of relation type | UID of right hand object | Name of right hand object | UID of UoM | Name of UoM |
| Project A | 501 | P-1301 | 301 | 1225 | is classified as a | 130058 | centrifugal pump | | |
| Project A | 501 | P-1301 | 302 | 1727 | has as aspect | 502 | cap-1301 | | |
| Project A | 502 | cap-1301 | 303 | 1225 | is classified as a | 551564 | capacity | | |
| Project A | 502 | cap-1301 | 304 | 5025 | has on scale a value equal to | 920466 | 300 | 570423 | mm |

Table 3, Gellish facts (instances) created on the basis of Gellish knowledge facts (a data model)
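The expansion of one knowledge fact into the four rows of Table 3 can be sketched as follows. This is an illustrative Python fragment, not the Gellish software itself. The function name `instantiate_aspect`, the dictionary keys and the way fact UIDs are allocated (a simple counter starting at 301) are assumptions. The relation type UIDs and object UIDs are copied from Table 3.

```python
from itertools import count

# Relation types as (UID, name) pairs, copied from Table 3.
IS_CLASSIFIED_AS_A = (1225, "is classified as a")
HAS_AS_ASPECT = (1727, "has as aspect")
HAS_ON_SCALE_A_VALUE_EQUAL_TO = (5025, "has on scale a value equal to")


def instantiate_aspect(possessor, possessor_class, aspect, aspect_class,
                       value, uom, fact_uids):
    """Expand one 'can have as aspect a' knowledge fact into instance facts.

    Every argument except fact_uids is a (UID, name) pair; fact_uids is an
    iterator that yields a new fact UID on each call to next().
    """
    rows = [
        (possessor, IS_CLASSIFIED_AS_A, possessor_class, None),   # classify the individual
        (possessor, HAS_AS_ASPECT, aspect, None),                 # realize the knowledge fact
        (aspect, IS_CLASSIFIED_AS_A, aspect_class, None),         # classify the aspect
        (aspect, HAS_ON_SCALE_A_VALUE_EQUAL_TO, value, uom),      # quantify the aspect
    ]
    return [
        {"fact_uid": next(fact_uids), "left": left, "relation": relation,
         "right": right, "uom": unit}
        for left, relation, right, unit in rows
    ]


facts = instantiate_aspect(
    possessor=(501, "P-1301"),
    possessor_class=(130058, "centrifugal pump"),
    aspect=(502, "cap-1301"),
    aspect_class=(551564, "capacity"),
    value=(920466, "300"),
    uom=(570423, "mm"),
    fact_uids=count(301),
)
for fact in facts:
    unit = fact["uom"][1] if fact["uom"] else ""
    print(fact["fact_uid"], fact["left"][1], f"<{fact['relation'][1]}>",
          fact["right"][1], unit)
```

The classification facts are generated together with the aspect fact, so the rule that every individual thing shall be classified is satisfied by construction.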

Continue with Development of Gellish enabled software

data_modeling_and_database_design_in_gellish_english.txt · Last modified: 2018/02/15 12:29 (external edit)