Imieliński-Lipski algebra

An Imieliński-Lipski algebras is an extension of relational algebra onto tables with different types of null values. It is used to operate on relations with incomplete information.

Imieliński-Lipski algebras are defined to satisfy precise conditions for semantically meaningful extension of the usual relational operators, such as projection, selection, union, and join, from operators on relations to operators on relations with various kinds of "null values".

These conditions require that the system be safe in the sense that no incorrect conclusion is derivable by using a specified subset F of the relational operators; and that it be complete in the sense that all valid conclusions expressible by relational expressions using operators in F are in fact derivable in this system. For example, it is well known that the 3-valued logic approach to deal with null values, supported treatment of nulls values by SQL is not complete, see Ullman book.[1] To show this, let T be:

T=
NAMECLASSGRADESEMESTER
RohitDatabasesBSpring
IgorNetworksANULL

Take SQL query Q

                Select NAME
                         From T
                             Where CLASS = 'Networks' & SEMESTER = 'Spring' OR GRADE = 'A' & SEMESTER != 'Spring'

SQL query Q will return empty set (no results) under 3-valued semantics currently adopted by all variants of SQL. This is the case because in SQL, NULL is never equal to any constant – in this case, neither to “Spring” nor “Fall” nor “Winter” (if there is Winter semester in this school). NULL=’Spring” will evaluate to MAYBE and so will NULL=’Fall”. The disjunction MAYBE OR MAYBE evaluates to MAYBE (not TRUE). Thus Igor will not be part of the answer (and of course neither will Rohit). But Igor should be returned as the answer.

Indeed, regardless what semester Igor took the Networks class (no matter what was the unknown value of NULL), the selection condition will be true. This “Igor” will be missed by SQL and the SQL answer won’t be complete according to completeness requirements specified in Imieliński, Witold Lipski, 'Incomplete Information in Relational Databases'.[2] It is also argued there that 3-valued logic (TRUE, FALSE, MAYBE) can never provide guarantee of complete answer for tables with incomplete information.

Three algebras which satisfy conditions of safety and completeness are defined as Imielinski-Lipski algebras: the Codd-Tables algebra, the V-tables algebra and the Conditional tables (C-tables) algebra.

Codd-tables Algebra

Codd-tables Algebra is based on the usual Codd's singe NULL values. The table T above is an example of Codd-table. Codd-table algebra supports projection and positive selections only. It is also demonstrated in [IL84 that it is not possible to correctly extend more relational operators over Codd-Tables. For example, such basic operation as join is not extendable over Codd-tables. It is not possible to define selections with Boolean conditions involving negation and preserve completeness. For example, queries like the above query Q cannot be supported. In order to be able to extend more relational operators, more expressive form of null value representation is needed in tables which are called V-table.

V-Tables Algebra

V-tables algebra is based on many different ("marked") null values or variables allowed to appear in a table. V-tables allow to show that a value may be unknown but the same for different tuples. For example, in the table below Gaurav and Igor order the same (but unknown) beer in two unknown bars (which may, or may not be different – but remain unknown). Gaurav and Jane frequent the same unknown bad (Y1). Thus, instead one NULL value, we use indexed variables, or Skolem constants .[3]

DRINKERBEERBAR
ZhihanHeinekenCabana
GauravX1Y1
IgorX1Y2
JaneBudY1
JohnX2Y3

V-Tables algebra is shown to correctly support projection, positive selection (with no negation occurring in the selection condition), union, and renaming of attributes, which allows for processing arbitrary conjunctive queries. A very desirable property enjoyed by the V-table algebra is that all relational operators on tables are performed in exactly the same way as in the case of the usual relations.

Conditional Tables (C-Tables) Algebra

Example of conditional table (c-table) is shown below.

NAMECLASSGRADESEMESTERcon
RohitDatabasesBSpringtrue
IgorNetworksAxx ='Spring'
IgorNetworksAxx !='Spring'

It has additional column “con” which is a Boolean condition involving variables, null values – same as in V-tables.

                Select *
                     From T
                          Where CLASS='Networks' & SEMESTER='Spring' OR GRADE = 'A' & SEMESTER !='Spring'

over the following table c-table

T1=
NAMECLASSGRADESEMESTERcon
RohitDatabasesBSpringtrue
IgorNetworksAxtrue

Conditional Tables algebra, mainly of theoretical interest, supports projection, selection, union, join, and renaming. Under closed world assumption, it can also handle the operator of difference, thus it can support all relational operators.

History

Imieliński-Lipski algebras were introduced by Tomasz Imieliński and Witold Lipski Jr. in 'Incomplete Information in Relational Databases' [2]

References

  1. J.D. Ullman (1982). Principles of Database Systems (2nd ed.). Computer Science Press, Potomac, MD.
  2. 1 2 Imieliński, T.; Lipski Jr., W. (1984). "Incomplete information in relational databases". Journal of the ACM. 31 (4): 761–791.
  3. Skolem constants

Further reading

  • Abiteboul, S.; Kanellakis, P.; Grahne, G. (1991). "On the representation and querying of sets of possible worlds" (PDF). Theoretical Computer Science. 78 (1): 159–187.
  • Abiteboul, S.; Duschka, O.M. (1998). "Complexity of Answering Queries Using Materialized Views". Proc. ACM SIGMOD-SIGACT-SIGART, PODS: 254–263.
  • Green, T.J.; Karvounarakis, G.; Tannen, Val (2007). "Provenance Semiring". Proc. ACM SIGMOD-SIGACT-SIGART, PODS: 31–40.
  • Karvounarakis, G.; Green, T.J. (2012). "Semiring-Annotated Data: Queries and Provenance" (PDF). ACM SIGMOD. 41 (3): 5–14.
  • T.J. Green (2009). Models for Incomplete and Probabilistic Information; Chapter 2, in Managing and Mining Uncertain Data. Springer Link.
  • Grahne, G.; Kiricenko, V. (Nov 2004). "Towards an algebraic theory of information integration" (PDF). Information and Computation. 194 (2): 79–100.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.