Programming complexity

Programming complexity (or software complexity) is a term that encompasses numerous properties of a piece of software, all of which affect internal interactions. According to several commentators, there is a distinction between the terms complex and complicated. Complicated implies being difficult to understand but, with time and effort, ultimately knowable. Complex, on the other hand, describes the interactions between a number of entities. As the number of entities increases, the number of possible interactions between them grows much faster (quadratically if only pairs are counted, exponentially if any group of entities can interact), until it becomes impossible to know and understand all of them. Similarly, higher levels of complexity in software increase the risk of unintentionally interfering with interactions, and so increase the chance of introducing defects when making changes. In more extreme cases, complexity can make modifying the software virtually impossible.

The idea of linking software complexity to the maintainability of the software has been explored extensively by Professor Manny Lehman, who developed his Laws of Software Evolution from his research. He and his co-author Les Belady explored numerous possible software metrics in their oft-cited book,^[1] looking for ones that could be used to measure the state of the software, and eventually concluded that the only practical solution would be a metric based on deterministic complexity models.
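As a rough illustration of how quickly interactions can outgrow entities, the following Python sketch counts possible interactions among n entities under two counting models. Both models are assumptions made for illustration, since the commentators' exact model is not stated here: pairwise interactions, n(n - 1)/2, and interactions among any group of two or more entities, 2^n - n - 1. The function names are hypothetical.

```python
from math import comb


def pairwise_interactions(n: int) -> int:
    """Number of distinct pairs among n entities: n * (n - 1) / 2."""
    return comb(n, 2)


def group_interactions(n: int) -> int:
    """Number of subsets containing at least two of n entities: 2**n - n - 1."""
    return 2 ** n - n - 1


for n in (5, 10, 20):
    # Prints: 5 10 26, then 10 45 1013, then 20 190 1048555
    print(n, pairwise_interactions(n), group_interactions(n))
```

Under either model, doubling the number of entities far more than doubles the number of interactions that would need to be understood.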

Measures

Many measures of software complexity have been proposed. Many of these, although yielding a good representation of complexity, do not lend themselves to easy measurement. Some of the more commonly used metrics are:

McCabe's cyclomatic complexity metric
Halstead's software science metrics
Henry and Kafura introduced Software Structure Metrics Based on Information Flow in 1981,^[2] which measure complexity as a function of fan-in and fan-out. They define the fan-in of a procedure as the number of local flows into that procedure plus the number of data structures from which that procedure retrieves information. Fan-out is defined as the number of local flows out of that procedure plus the number of data structures that the procedure updates. Local flows relate to data passed to and from procedures that call, or are called by, the procedure in question. Henry and Kafura's complexity value is defined as "the procedure length multiplied by the square of fan-in multiplied by fan-out", that is, Length × (fan-in × fan-out)², where the square applies to the product of fan-in and fan-out.
A Metrics Suite for Object Oriented Design^[3] was introduced by Chidamber and Kemerer in 1994, focusing, as the title suggests, on metrics specifically for object-oriented code. They introduce six OO complexity metrics: weighted methods per class, coupling between object classes, response for a class, number of children, depth of inheritance tree, and lack of cohesion of methods.
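The Henry and Kafura value described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the `Procedure` class and its field names are hypothetical, the flow counts are assumed to have been gathered by some earlier analysis, and length is assumed to be measured in lines of code.

```python
from dataclasses import dataclass


@dataclass
class Procedure:
    """Hypothetical record of the raw counts needed for the metric."""
    name: str
    length: int              # assumed unit: lines of code
    local_flows_in: int      # local flows into the procedure
    local_flows_out: int     # local flows out of the procedure
    structures_read: int     # data structures the procedure retrieves from
    structures_updated: int  # data structures the procedure updates

    @property
    def fan_in(self) -> int:
        # Fan-in: local flows in plus data structures read.
        return self.local_flows_in + self.structures_read

    @property
    def fan_out(self) -> int:
        # Fan-out: local flows out plus data structures updated.
        return self.local_flows_out + self.structures_updated

    def complexity(self) -> int:
        # Length x (fan-in x fan-out) squared.
        return self.length * (self.fan_in * self.fan_out) ** 2


proc = Procedure("update_account", length=20,
                 local_flows_in=2, local_flows_out=1,
                 structures_read=1, structures_updated=1)
print(proc.fan_in, proc.fan_out, proc.complexity())  # 3 2 720
```

Because the product of fan-in and fan-out is squared, a procedure that is heavily connected to the rest of the system scores far higher than a long but isolated one.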

There are several other metrics that can be used to measure programming complexity:

Branching complexity (Sneed Metric)
Data access complexity (Card Metric)
Data complexity (Chapin Metric)
Data flow complexity (Elshof Metric)
Decisional complexity (McClure Metric)

Types

Associated with, and dependent on, the complexity of an existing program is the complexity of changing the program. The complexity of a problem can be divided into two parts:^[4]

Accidental complexity: Relates to difficulties a programmer faces due to the chosen software engineering tools. A better-fitting set of tools or a higher-level programming language may reduce it. Accidental complexity is also often a consequence of not using the domain to frame the form of the solution, i.e. the code. One practice that can help in avoiding accidental complexity is domain-driven design.
Essential complexity: Is caused by the characteristics of the problem to be solved and cannot be reduced.

Chidamber and Kemerer Metrics

Chidamber and Kemerer^[3] proposed a set of programming complexity metrics that are widely used in measurements and academic articles. They are WMC, CBO, RFC, NOC, DIT, and LCOM, described below:

WMC - weighted methods per class
- $WMC=\sum _{i=1}^{n}c_{i}$
- $n$ is the number of methods in the class
- $c_{i}$ is the complexity of the $i$th method
CBO - coupling between object classes
- number of other classes to which the class is coupled (classes it uses or is used by)
RFC - response for a class
- $RFC=|RS|$ where
- $RS=\{M\}\cup _{all\ i}\{R_{i}\}$
- $R_{i}$ is the set of methods called by method $i$
- $M$ is the set of methods in the class
NOC - number of children
- number of immediate subclasses of this class (direct children only, not all descendants)
DIT - depth of inheritance tree
- maximum length of the path from this class to the root of the inheritance tree
LCOM - lack of cohesion of methods
- Compares the number of method pairs in the class that share no attributes with the number of pairs that share at least one
- $LCOM={\begin{cases}|P|-|Q|,&{\text{if }}|P|>|Q|\\0,&{\text{otherwise }}\end{cases}}$
- Where $P=\{(I_{i},I_{j})|I_{i}\cap I_{j}=\emptyset \}$
- And $Q=\{(I_{i},I_{j})|I_{i}\cap I_{j}\neq \emptyset \}$
- Where $I_{i}$ is the set of attributes (variables) used by the $i$th method of the class
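The set-based definitions of WMC, RFC and LCOM above can be tried out on a toy class description. The Python sketch below is illustrative only: the `Method` record, the example `account` class and the choice of complexity weights are all assumptions, and in practice the call sets and attribute sets would come from a static analysis of real code.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Method:
    """Hypothetical description of one method of a class."""
    name: str
    complexity: int = 1                       # c_i; weights here are assumed
    calls: set = field(default_factory=set)   # R_i: methods called by this method
    attrs: set = field(default_factory=set)   # I_i: attributes used by this method


def wmc(methods):
    """Weighted methods per class: the sum of the method complexities."""
    return sum(m.complexity for m in methods)


def rfc(methods):
    """Response for a class: size of M united with every R_i."""
    response_set = {m.name for m in methods}
    for m in methods:
        response_set |= m.calls
    return len(response_set)


def lcom(methods):
    """Lack of cohesion: |P| - |Q| if positive, otherwise 0."""
    p = q = 0
    for a, b in combinations(methods, 2):
        if a.attrs & b.attrs:
            q += 1   # the pair shares at least one attribute
        else:
            p += 1   # the pair shares no attribute
    return max(p - q, 0)


# Toy class with four methods (names are made up for the example).
account = [
    Method("deposit", 1, {"Ledger.record"}, {"balance"}),
    Method("withdraw", 2, {"Ledger.record", "check_limit"}, {"balance", "limit"}),
    Method("check_limit", 1, set(), {"limit"}),
    Method("owner_name", 1, set(), {"owner"}),
]

print(wmc(account), rfc(account), lcom(account))  # 5 5 2
```

In this example four of the six method pairs share no attribute and two share one, so LCOM is 4 - 2 = 2; the `owner_name` method, which touches nothing the others use, accounts for most of the lack of cohesion.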

References

↑ Lehman, M.M.; Belady, L.A. Program Evolution: Processes of Software Change, 1985
↑ Henry, S.; Kafura, D. IEEE Transactions on Software Engineering, Volume SE-7, Issue 5, Sept. 1981, Page(s): 510-518
↑ Chidamber, S.R.; Kemerer, C.F. IEEE Transactions on Software Engineering, Volume 20, Issue 6, Jun 1994, Page(s): 476-493
↑ In software engineering, a problem can be divided into its accidental and essential complexity [1].
