IDEF1X – Data Modeling Method

Overview

IDEF1X is a method for designing relational databases with a syntax designed to support the semantic constructs necessary in developing a conceptual schema. A conceptual schema is a single integrated definition of the enterprise data that is unbiased toward any single application and independent of its access and physical storage. Because it is a design method, IDEF1X is not particularly suited to serve as an AS-IS analysis tool, although it is often used in that capacity as an alternative to IDEF1. IDEF1X is most useful for logical database design after the information requirements are known and the decision to implement a relational database has been made. Hence, the IDEF1X system perspective is focused on the actual data elements in a relational database. If the target system is not a relational system, for example, an object-oriented system, IDEF1X is not the best method.

There are several reasons why IDEF1X is not well-suited for non-relational system implementations. IDEF1X requires, for example, that the modeler designate a key class to distinguish one entity from another, whereas object-oriented systems do not require keys to individuate one object from another. Further, where more than one attribute or set of attributes would serve equally well for individuating IDEF1X entities, the modeler must designate one as the primary key and list all others as alternate keys. Explicit foreign key labeling is also required. The resulting IDEF1X logical design models are intended for the programmers who take the blueprint for the logical database design and implement it. However, the IDEF1X modeling language is sufficiently similar to IDEF1 that models generated from the IDEF1 information requirements can be reviewed and understood by the ultimate users of the proposed system.

In December 1993, the Computer Systems Laboratory of the National Institute of Standards and Technology (NIST) released IDEF1X as a standard for data modeling in FIPS Publication 184. In 2012, IEEE also adopted IDEF1X as a standard.

IDEF1X Concepts

Although the terminology between IDEF1 and IDEF1X is very similar, there are fundamental differences in the theoretical foundations and concepts of the two methods. An entity in IDEF1X refers to a collection or set of similar data instances that can be individually distinguished from one another. Individual members of the set are called entity instances. Thus, a box in IDEF1X represents a set of data items in the real-world realm. An attribute is a slot value associated with each individual member of the set. The relationship that exists between individual members of these sets is given a name. In this case, this relation establishes a referential integrity constraint.

A powerful feature of IDEF1X is its support for modeling logical data types through the use of a classification structure or generalization/specialization construct. This construct is an attempt to overlay models of the natural kinds of things that the data represents, whereas the boxes, or entities, attempt to model types of data things. These categorization relationships represent mutually exclusive subsets of a generic entity or set; that is, subsets of the superset cannot have common instances. For example, a generic entity PERSON has two subsets, MALE and FEMALE, which together form a complete set of categories. No instance of the MALE subset can be an instance of the FEMALE subset, and vice versa. The unique identifier attribute for each subset is the same attribute as that for a generic entity instance.

Syntax and Semantics of IDEF1X

Entities

In IDEF1X, entities are either identifier-independent or identifier-dependent. Instances of identifier-independent entities can exist without any other entity instance, while instances of identifier-dependent entities are meaningless (by definition) without another associated entity instance. Dependence and independence are specific to a model.

Connection Relationships

Connection relationships (solid or dashed lines with filled circles at one or both ends) denote how entities (sets of data instances) relate to one another. The connection relationships are always between exactly two entities. The connection relationship beginning at the independent parent entity and ending at the dependent child entity is labeled with a verb phrase describing the relationship. Each connection relationship has an associated cardinality. The cardinality specifies the number of instances of the dependent entity that are related to an instance of the independent entity.
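The cardinality of a connection relationship can be illustrated with a small check over sample data. The following is a minimal sketch, not part of the IDEF1X standard: the entities DEPARTMENT and EMPLOYEE, the attribute names, and both helper functions are hypothetical and chosen only for illustration.

```python
from collections import Counter

def children_per_parent(parent_keys, child_rows, fk_name):
    """Count how many child rows reference each parent key."""
    counts = Counter(row[fk_name] for row in child_rows)
    return {key: counts.get(key, 0) for key in parent_keys}

def satisfies_cardinality(counts, minimum=0, maximum=None):
    """True if every parent has between minimum and maximum children."""
    return all(
        n >= minimum and (maximum is None or n <= maximum)
        for n in counts.values()
    )

# Hypothetical data: DEPARTMENT is the parent, EMPLOYEE is the child.
departments = ["D1", "D2", "D3"]
employees = [
    {"emp_id": 1, "dept_id": "D1"},
    {"emp_id": 2, "dept_id": "D1"},
    {"emp_id": 3, "dept_id": "D2"},
]

counts = children_per_parent(departments, employees, "dept_id")
print(counts)                                    # {'D1': 2, 'D2': 1, 'D3': 0}
print(satisfies_cardinality(counts))             # zero, one, or more: True
print(satisfies_cardinality(counts, minimum=1))  # one or more: False (D3 has none)
```

Here the cardinality is expressed as a minimum and an optional maximum number of child instances per parent instance, which is one simple way to encode constraints such as "zero, one, or more" or "one or more".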

Categorization Relationships

Categorization relationships allow the modeler to define the category of an entity. An entity can belong to only one category. For instance, there could be an entity CAR that is the generic entity in a category showing different types of cars. Each category entity must have the same primary key as CAR. Also, there must be a way of distinguishing between the category entities. The category entities are distinguished by a discriminator attribute which must have a different value for each category entity.
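The constraints on a categorization structure can be sketched as a consistency check. This is an illustrative sketch under stated assumptions: the category names SEDAN and TRUCK, the attributes "vin" and "car_type", and the function check_categories are all hypothetical and do not come from the IDEF1X standard.

```python
def check_categories(generic_rows, categories, key, discriminator):
    """Return a list of problems found in a categorization structure.

    categories maps a discriminator value to the set of primary-key
    values stored in that category entity.
    """
    problems = []
    for row in generic_rows:
        pk = row[key]
        found_in = sorted(name for name, keys in categories.items() if pk in keys)
        if len(found_in) > 1:
            # Categories are mutually exclusive.
            problems.append(f"{pk}: in more than one category {found_in}")
        elif found_in and found_in[0] != row[discriminator]:
            # The discriminator must agree with the category that holds the row.
            problems.append(
                f"{pk}: discriminator is {row[discriminator]}, stored in {found_in[0]}"
            )
    return problems

# Hypothetical generic entity CAR with discriminator attribute car_type.
cars = [
    {"vin": "V1", "car_type": "SEDAN"},
    {"vin": "V2", "car_type": "TRUCK"},
]

good = {"SEDAN": {"V1"}, "TRUCK": {"V2"}}
bad = {"SEDAN": {"V1", "V2"}, "TRUCK": {"V2"}}

print(check_categories(cars, good, "vin", "car_type"))  # []
print(check_categories(cars, bad, "vin", "car_type"))
# ["V2: in more than one category ['SEDAN', 'TRUCK']"]
```

Each category entity is represented only by the set of primary-key values it holds, which reflects the requirement that category entities share the primary key of the generic entity.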

Attributes

Attributes are properties used to describe an entity. Attribute names are unique throughout an IDEF1X model, and the meaning of the names must be consistent. For example, the attribute "color" could have several possible uses: hair color, skin color, or a color in a rainbow. Each use has its own range of meaningful values, and thus each attribute must be distinctly named. Each attribute is owned by exactly one entity. The attribute "social security number," for example, could be used in many places in a model, but would be owned by only one entity (e.g., PERSON). Other occurrences of the social security number attribute would be inherited across relationships.

Every attribute must have a value (No-Null Rule), and no attribute may have multiple values (No-Repeat Rule). These rules enforce the creation of proper models. In a situation where it seems that a rule cannot hold, the model is likely wrong.
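The two rules can be illustrated by scanning sample entity instances for violations. This is a minimal sketch under assumptions: entity instances are modeled as Python dictionaries, a missing value is represented by None, and a repeating value by a list, tuple, or set. The function rule_violations and the sample data are hypothetical.

```python
def rule_violations(rows):
    """Return (row index, attribute name, rule name) for each violation."""
    violations = []
    for index, row in enumerate(rows):
        for name, value in row.items():
            if value is None:
                violations.append((index, name, "No-Null"))
            elif isinstance(value, (list, tuple, set)):
                violations.append((index, name, "No-Repeat"))
    return violations

# Hypothetical PERSON instances.
people = [
    {"ssn": "111", "name": "Ann", "phone": "555-0100"},
    {"ssn": "222", "name": "Bob", "phone": None},                      # no value
    {"ssn": "333", "name": "Cy", "phone": ["555-0101", "555-0102"]},   # repeated value
]

print(rule_violations(people))
# [(1, 'phone', 'No-Null'), (2, 'phone', 'No-Repeat')]
```

In this example the violations suggest that "phone" does not belong to PERSON as modeled; a common remedy is to move it to a separate entity related to PERSON, so that a person may have zero, one, or several phone instances.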

Keys

A key is a group of attributes that uniquely identifies an entity instance. There are primary and alternate keys. Every entity has exactly one primary key, displayed above the horizontal line in the entity box. Entities can have alternate keys that also uniquely identify the entity but are not used for describing relationships with other entities.

In a connection relationship, the primary key of the parent migrates to the child. If the relationship is a category relationship, the primary key of the child is the same as that of the generic entity. If the relationship is an identifying relationship, the primary key of the child must contain attributes inherited from the parent.
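Key migration can be sketched with a small data structure. The following is an illustrative sketch, not a definitive implementation: the Entity class, the migrate_key function, and the entities CUSTOMER, ORDER, and ORDER_LINE are hypothetical. It assumes that in an identifying relationship the migrated attributes join the child's primary key, and that otherwise they become non-key attributes of the child.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    primary_key: list
    non_key: list = field(default_factory=list)
    foreign_keys: set = field(default_factory=set)

def migrate_key(parent, child, identifying):
    """Copy the parent's primary-key attributes into the child as foreign keys."""
    target = child.primary_key if identifying else child.non_key
    for attribute in parent.primary_key:
        if attribute not in child.primary_key and attribute not in child.non_key:
            target.append(attribute)
        child.foreign_keys.add(attribute)  # would be labeled (FK) in a diagram

customer = Entity("CUSTOMER", ["customer_id"])
order = Entity("ORDER", ["order_id"], ["order_date"])
line = Entity("ORDER_LINE", ["line_no"], ["quantity"])

migrate_key(customer, order, identifying=False)  # ORDER exists on its own key
migrate_key(order, line, identifying=True)       # ORDER_LINE depends on ORDER

print(order.non_key)      # ['order_date', 'customer_id']
print(line.primary_key)   # ['line_no', 'order_id']
print(line.foreign_keys)  # {'order_id'}
```

Only the parent's primary key migrates, so customer_id, a non-key attribute of ORDER, does not reach ORDER_LINE.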

Besides the fact that a key must uniquely identify an entity, all attributes in the key must contribute to the unique identification (Smallest-Key Rule). Thus, when deciding whether an inherited attribute should be part of a key, an issue is whether that attribute is necessary for unique identification. It is not sufficient that it contributed to the unique identification of the parent.
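The Smallest-Key Rule can be probed against sample data by testing whether any proper subset of a candidate key is already unique. This is a sketch under assumptions: the functions is_unique and redundant_subsets and the ORDER_LINE data are hypothetical, and sample data can only expose a redundant key attribute, never prove that a key is minimal for all possible instances.

```python
from itertools import combinations

def is_unique(rows, attributes):
    """True if no two rows share the same values for the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True

def redundant_subsets(rows, key):
    """Return the proper subsets of key that are already unique in rows."""
    return [
        subset
        for size in range(1, len(key))
        for subset in combinations(key, size)
        if is_unique(rows, subset)
    ]

# Hypothetical ORDER_LINE instances.
order_lines = [
    {"order_id": 1, "line_no": 1, "product": "A"},
    {"order_id": 1, "line_no": 2, "product": "B"},
    {"order_id": 2, "line_no": 1, "product": "A"},
    {"order_id": 1, "line_no": 3, "product": "A"},
]

print(redundant_subsets(order_lines, ("order_id", "line_no")))
# []  (both attributes are needed)
print(redundant_subsets(order_lines, ("order_id", "line_no", "product")))
# [('order_id', 'line_no')]  (product adds nothing to identification)
```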

There are also two dependency rules: The Full-Functional-Dependency Rule states that if the primary key is composed of multiple attributes, all non-key attributes must be functionally dependent on the entire primary key. The No-Transitive-Dependency Rule states that every non-key attribute must be functionally dependent only on key attributes.
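Both dependency rules rest on the notion of functional dependency, which can be tested against sample data. The sketch below is illustrative only: the function determines and the ORDER_LINE and EMPLOYEE data are hypothetical, and a dependency that holds in a sample is evidence of a modeling problem, not proof.

```python
def determines(rows, lhs, rhs):
    """True if equal lhs values always come with equal rhs values in rows."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True

# ORDER_LINE with primary key (order_id, product_id).
order_lines = [
    {"order_id": 1, "product_id": "P1", "product_name": "Bolt", "quantity": 10},
    {"order_id": 1, "product_id": "P2", "product_name": "Nut", "quantity": 5},
    {"order_id": 2, "product_id": "P1", "product_name": "Bolt", "quantity": 7},
]

# product_name depends on only part of the key: the
# Full-Functional-Dependency Rule is violated.
print(determines(order_lines, ["product_id"], ["product_name"]))  # True
# quantity needs the whole key, as the rule requires.
print(determines(order_lines, ["product_id"], ["quantity"]))      # False
print(determines(order_lines, ["order_id"], ["quantity"]))        # False

# EMPLOYEE with primary key emp_id.
employees = [
    {"emp_id": 1, "dept_id": "D1", "dept_name": "Sales"},
    {"emp_id": 2, "dept_id": "D1", "dept_name": "Sales"},
    {"emp_id": 3, "dept_id": "D2", "dept_name": "Ops"},
]

# dept_name depends on the non-key attribute dept_id: the
# No-Transitive-Dependency Rule is violated.
print(determines(employees, ["dept_id"], ["dept_name"]))          # True
```

In both violating cases, the usual remedy is to move the offending attribute to the entity whose primary key actually determines it (here, hypothetical PRODUCT and DEPARTMENT entities).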

Foreign Keys

Foreign keys are not really keys at all, but attributes inherited from the primary keys of other entities. Foreign keys are labeled (FK) to show that they are not owned by that entity. Foreign keys are significant because they show the relationships between entities. Because entities are described by their attributes, if an entity is composed of attributes inherited from other entities, that entity is similar to those entities.

Strengths of IDEF1X

IDEF1X is a powerful tool for data modeling even though there are numerous other data modeling methods including ER and ENALIM. One strength of IDEF1X lies in its roots. Due to the strict standardization of DoD projects, IDEF1X will probably escape having the numerous variants that have hindered the use of ER. Having a standard and adhering to it are crucial to transferring knowledge between organizations.

A weakness of nearly all methods including IDEF1X is that the modeler must be experienced in order to create good models. Modeling is not an intuitive process, and many times models will be discarded due to a poor start. The simpler the method is to use, the better, but the method must still have the necessary expressive power. A good example of a powerful concept which can be abused is the category relation. Whereas there are cases when categories are necessary, there are other cases when they are used to create meaningless entities. Many inexperienced IDEF1X modelers tend to fall into the trap of using the categorization features of IDEF1X to model natural taxonomies as opposed to data taxonomies as they are intended. Because of the categorization components of the IDEF1X method, many domain experts have fallen into the trap of using the method for concept and terminology definition. Unfortunately, the data modeling considerations built into the IDEF1X rules do not allow it to function adequately for this purpose. The result is that much of the information gathered cannot be expressed or is expressed erroneously. For example, to function adequately as a language for concept and terminology definition, IDEF1X would have to be capable of expressing facts such as "A statement of work is a document and is a legal contract" and "A square is a polygon with four equal sides."

KBSI has developed an automated Information and Data Modeling tool, SMARTER®, to support the IDEF1 and IDEF1X methods.