Software construction

Software construction is the process of creating working software via coding and integration. The process includes unit and integration testing but does not include higher-level testing such as system testing.^[1]

Construction is an aspect of the software development lifecycle and is integrated into the various software development process models, which differ in how far they treat construction as an activity separate from other activities. In the waterfall model, a software development effort consists of sequential phases including requirements analysis, design, and planning, which are prerequisites for starting construction. In an iterative model such as scrum, evolutionary prototyping, or extreme programming, construction is an activity that occurs concurrently with, or overlaps, other activities.^[1]

Construction planning may include defining the order in which components are created and integrated, the software quality management processes, and the allocation of tasks to teams and developers.^[1]

To facilitate project management, numerous construction aspects can be measured; these include the amount of code developed, modified, reused, and destroyed, code complexity, code inspection statistics, faults-fixed and faults-found rates, and effort expended. These measurements can be useful for aspects such as ensuring quality and improving the process.^[1]

Activities

Construction includes many activities.

Coding

The following are a few of the key aspects of the coding activity:^[2]

Naming

Choice of name for each identifier. One study showed that the effort required to debug a program is minimized when variable names are between 10 and 16 characters.^[3]

Logic

Organization into statements and routines^[4]

Highly cohesive routines proved to be less error prone than routines with lower cohesion. A study of 450 routines found that 50 percent of the highly cohesive routines were fault free compared to only 18 percent of routines with low cohesion. Another study of a different 450 routines found that routines with the highest coupling-to-cohesion ratios had 7 times as many errors as those with the lowest coupling-to-cohesion ratios and were 20 times as costly to fix.
Although studies showed inconclusive results regarding the correlation between routine size and error rate, one study found that routines with fewer than 143 lines of code were 2.4 times less expensive to fix than larger routines. Another study showed that the code needed to be changed least when routines averaged 100 to 150 lines of code. Another study found that structural complexity and amount of data in a routine were correlated with errors regardless of its size.
Interfaces between routines are some of the most error-prone areas of a program. One study showed that 39 percent of all errors were errors in communication between routines.
Unused parameters are correlated with an increased error rate. In one study, only 17 to 29 percent of routines with more than one unreferenced variable had no errors, compared to 46 percent in routines with no unused variables.
The number of parameters of a routine should be at most 7, as research has found that people generally cannot keep track of more than about seven chunks of information at once.
One experiment showed that designs which access arrays sequentially, rather than randomly, result in fewer variables and fewer variable references.^[5]
One experiment found that loops-with-exit are more comprehensible than other kinds of loops.^[6]
Regarding the level of nesting in loops and conditionals, studies have shown that programmers have difficulty comprehending more than three levels of nesting.^[6]^[7]
Control flow complexity has been shown to correlate with low reliability and frequent errors.^[7]
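
As an illustration of the loop-with-exit form mentioned above, the following Python sketch places a single exit test in the middle of the loop body, between fetching a value and processing it. The function name `sum_until_sentinel` and the sentinel value are hypothetical choices made for this example, not taken from the cited studies.

```python
def sum_until_sentinel(values, sentinel=-1):
    """Add up values until the sentinel appears (hypothetical example)."""
    total = 0
    iterator = iter(values)
    while True:
        # Fetch step: exhausted input is treated like the sentinel.
        value = next(iterator, sentinel)
        if value == sentinel:
            break  # the single exit, placed in the middle of the loop
        # Process step: only reached for real data.
        total += value
    return total
```

The loop keeps one level of nesting, which also fits the guidance above on limiting nesting depth.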

Modularity

Structuring and refactoring the code into classes, packages and other structures. When considering containment, the maximum number of data members in a class should not exceed 7±2. Research has shown that this is the number of discrete items a person can remember while performing other tasks. When considering inheritance, the number of levels in the inheritance tree should be limited. Deep inheritance trees have been found to be significantly associated with increased fault rates. When considering the number of routines in a class, it should be kept as small as possible. A study on C++ programs has found an association between the number of routines and the number of faults.^[8] A study by NASA showed that putting the code into well-factored classes can double code reusability compared to code developed using functional design.^[8]^[4]

Error handling

Encoding logic to handle both planned and unplanned errors and exceptions.

Resource management

Managing computational resource use via exclusion mechanisms and discipline in accessing serially reusable resources, including threads or database locks.
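
A minimal sketch of an exclusion mechanism, assuming Python's standard `threading` module: a lock serializes access to a shared value so that concurrent updates are not lost. The class `SharedCounter` and the helper `run_workers` are hypothetical names introduced only for illustration.

```python
import threading


class SharedCounter:
    """Hypothetical serially reusable resource: one integer shared by threads."""

    def __init__(self):
        self._lock = threading.Lock()  # exclusion mechanism guarding _value
        self._value = 0

    def increment(self):
        # The with-statement acquires the lock and always releases it,
        # even if the body raises an exception.
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def run_workers(counter, workers=4, increments=1000):
    """Start several threads that all update the same counter."""
    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value()
```

Because every access goes through the lock, the final count equals `workers * increments` regardless of how the threads interleave.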

Security

Prevention of code-level security breaches such as buffer overrun and array index overflow.
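
The following Python sketch illustrates the bounds-checking discipline behind preventing index overflow. Python is memory-safe, so this does not reproduce a true buffer overrun; it shows explicit validation of an index before use, including rejecting negative indices, which Python would otherwise silently wrap around. The function name `read_slot` is hypothetical.

```python
def read_slot(buffer, index):
    """Return buffer[index] only if index is a valid, non-negative position."""
    if not isinstance(index, int) or isinstance(index, bool):
        raise TypeError("index must be an integer")
    if not 0 <= index < len(buffer):
        # Reject out-of-range and negative indices instead of trusting the caller.
        raise IndexError(f"index {index} is outside the buffer of length {len(buffer)}")
    return buffer[index]
```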

Optimization

Optimization while avoiding premature optimization.

Documentation

Both embedded in the code as comments and as external documents.

Integration

Integration is about combining separately constructed parts. Concerns include planning the sequence in which components will be integrated, creating scaffolding to support interim versions of the software, determining the degree of testing and quality work performed on components before they are integrated, and determining points in the project at which interim versions are tested.^[1]

Testing

Testing can reduce the time between when faulty logic is inserted in the code and when it is detected. In some cases, testing is performed after code has been written, but in test-first programming, test cases are created before code is written. Construction includes at least two forms of testing, often performed by the developer who wrote the code:^[1] unit testing and integration testing.
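
As a sketch of developer-written unit testing, the example below uses Python's standard `unittest` module. The unit under test, `is_leap_year`, is a hypothetical function chosen for brevity; in test-first programming the test cases would be written before it.

```python
import unittest


def is_leap_year(year):
    """Return True for Gregorian leap years (hypothetical unit under test)."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    # Each test method checks one rule of the leap-year definition.
    def test_divisible_by_four(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_not_leap(self):
        self.assertFalse(is_leap_year(1900))

    def test_fourth_century_is_leap(self):
        self.assertTrue(is_leap_year(2000))

    def test_ordinary_year(self):
        self.assertFalse(is_leap_year(2023))
```

Saved as a module, the tests can be run with `python -m unittest`.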

Reuse

Software reuse entails more than creating and using libraries. It requires formalizing the practice of reuse by integrating reuse processes and activities into the software life cycle. The tasks related to reuse in software construction during coding and testing may include:^[1] selection of the reusable code, evaluation of code or test re-usability, reporting reuse metrics.

Quality assurance

Techniques for ensuring quality as software is constructed include:^[9]

Testing

One study found that the average defect detection rates of unit testing and integration testing are 30% and 35% respectively.^[10]

Software inspection

With respect to software inspection, one study found that the average defect detection rate of formal code inspections is 60%. Regarding the cost of finding defects, a study found that code reading detected 80% more faults per hour than testing. Another study showed that it costs six times more to detect design defects by using testing than by using inspections. A study by IBM showed that only 3.5 hours were needed to find a defect through code inspections versus 15 to 25 hours through testing. Microsoft has found that it takes 3 hours to find and fix a defect by using code inspections and 12 hours to find and fix a defect by using testing. In a program of 700,000 lines, it was reported that code reviews were several times as cost-effective as testing.^[10] Studies found that inspections result in 20% to 30% fewer defects per 1000 lines of code than less formal review practices and that they increase productivity by about 20%. Formal inspections will usually take 10% to 15% of the project budget and will reduce overall project cost. Researchers found that having more than 2 to 3 reviewers on a formal inspection does not increase the number of defects found, although the results seem to vary depending on the kind of material being inspected.^[11]

Technical review

With respect to technical review, one study found that the average defect detection rates of informal code reviews and desk checking are 25% and 40% respectively.^[10] Walkthroughs were found to have a defect detection rate of 20% to 40%, but were also found to be expensive, especially when project pressures increase. Code reading was found by NASA to detect 3.3 defects per hour of effort versus 1.8 defects per hour for testing. It also finds 20% to 60% more errors over the life of the project than different kinds of testing. A study of 13 reviews about review meetings found that 90% of the defects were found in preparation for the review meeting while only around 10% were found during the meeting.^[11]

Static analysis

With respect to static analysis (IEEE 1028), studies have shown that a combination of these techniques needs to be used to achieve a high defect detection rate. Other studies showed that different people tend to find different defects. One study found that the extreme programming practices of pair programming, desk checking, unit testing, integration testing, and regression testing can achieve a 90% defect detection rate.^[10] An experiment involving experienced programmers found that on average they were able to find 5 errors (9 at best) out of 15 errors by testing.^[12]

80% of the errors tend to be concentrated in 20% of the project's classes and routines. 50% of the errors are found in 5% of the project's classes. IBM was able to reduce customer-reported defects by a factor of ten and to reduce its maintenance budget by 45% in its IMS system by repairing or rewriting only 31 out of 425 classes. Around 20% of a project's routines contribute to 80% of the development costs. A classic study by IBM found that a few error-prone routines of OS/360 were the most expensive entities. They had around 50 defects per 1000 lines of code, and fixing them cost 10 times what it took to develop the whole system.^[12]

Re-design

In order to account for the unanticipated gaps in the software design, design modifications may be made during construction.^[13]

Language

Types of languages used for construction include:^[14]

General-purpose programming language – the most flexible type of language
Configuration language – developers choose from a limited set of options to create a custom software installation
Toolkit language – used to build applications out of toolkits

Programmers working in a language they have used for three years or more are about 30 percent more productive than programmers with equivalent experience who are new to a language. High-level languages such as C++, Java, Smalltalk, and Visual Basic yield 5 to 15 times better productivity, reliability, simplicity, and comprehensibility than low-level languages such as assembly and C. Equivalent code has been shown to need fewer lines in high-level languages than in lower-level languages.^[15]

Best practices

Many factors contribute to software quality and minimize cost of ownership.

Minimize complexity

Minimizing programming complexity is mainly driven by the limited ability of people to effectively process complex information. Complexity can be reduced via construction-focused quality techniques.^[16]

Anticipate change

Anticipating change helps developers build extensible software, that is, code that can be enhanced without disrupting the inherent design.^[16] Research over 25 years shows that rework can be 10 to 100 times (5 to 10 times for smaller projects) more expensive than getting the requirements right the first time. Given that, on an average project, 25% of the requirements change during development, the need to reduce the cost of rework underscores the need for anticipating change.^[17]

Construct for verification

Constructing for verification means building software in such a way that faults can be ferreted out readily by the developers as well as during independent testing and operational activities. Specific techniques that support constructing for verification include following coding standards to support code reviews, unit testing, organizing code to support automated testing, and restricted use of complex or hard-to-understand language structures, among others.^[16]

Information hiding

Information hiding has proved to be a useful design technique in large programs, making them easier to modify by a factor of 4. Low fan-out is one of the design characteristics found to be beneficial by researchers.^[18]

Reuse

Software reuse can realize significant productivity, quality, and cost benefits. The primary benefits are achieved by reusing existing software assets, and reuse is supported by creating software designed for future reuse.^[16]

Standards

Standards, whether external (created by international organizations) or internal (created at the corporate level), that directly affect construction issues include:^[16]

Documentation standards for format and content
Modelling standards such as UML
Coding standards

Data abstraction

Data abstraction is a characteristic of source code that represents information in a form that is similar to its meaning, while hiding implementation details.^[19] Academic research showed that data abstraction makes programs about 30% easier to understand than functional programs.^[8]

Object-oriented languages support a series of runtime mechanisms that increase the flexibility and adaptability of the programs like data abstraction, encapsulation, modularity, inheritance, polymorphism, and reflection.^[20]^[21]
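
A minimal sketch of data abstraction in Python: callers work with stack operations that match the meaning of the data, while the backing list stays an implementation detail. The class `Stack` and its method names are hypothetical and chosen only for illustration.

```python
class Stack:
    """Last-in, first-out collection; the backing list is hidden from callers."""

    def __init__(self):
        self._items = []  # implementation detail, could be swapped for another structure

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def __len__(self):
        return len(self._items)
```

Client code written against `push`, `pop`, and `peek` would not need to change if the internal representation were replaced.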

Defensive programming

Defensive programming is the protection of a routine from being broken by invalid inputs.^[22] Assertions are executable predicates placed in a program that allow runtime checks of the program.^[20] Design by contract is a development approach in which preconditions and postconditions are included for each routine.
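
The three ideas can be sketched together in Python. In this hypothetical routine, `integer_sqrt`, the input checks act as preconditions that guard against invalid arguments, and an assertion states the postcondition. `math.isqrt` is part of the standard library from Python 3.8 onward.

```python
import math


def integer_sqrt(n):
    """Return the largest integer r with r * r <= n (hypothetical example)."""
    # Preconditions: reject invalid input before it can break the routine.
    if not isinstance(n, int) or isinstance(n, bool):
        raise TypeError("n must be an integer")
    if n < 0:
        raise ValueError("n must be non-negative")

    root = math.isqrt(n)

    # Postcondition: an executable predicate checked at runtime.
    assert root * root <= n < (root + 1) ** 2
    return root
```

Invalid input is reported with exceptions rather than assertions because assertions can be disabled at runtime (for example with `python -O`), so they are better suited to checking the routine's own logic.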

Error handling

Error handling refers to the practice of coding for error conditions that may arise when a program runs. Exception handling is a programming-language construct or hardware mechanism designed to handle the occurrence of exceptions, special conditions that change the normal flow of program execution.^[23] Fault tolerance is a collection of techniques that increase software reliability by detecting errors and then recovering from them if possible or containing their effects if recovery is not possible.^[22]
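
As a small illustration of detecting an error and recovering from it, the Python sketch below parses a port number and falls back to a default when the input is unusable. The function name `parse_port` and the default of 8080 are assumptions made for this example.

```python
def parse_port(text, default=8080):
    """Convert text to a TCP port number, falling back to a default on error."""
    try:
        port = int(text)
    except (TypeError, ValueError):
        # Detection and recovery: malformed input is contained here
        # instead of propagating to the caller.
        return default
    if not 1 <= port <= 65535:
        return default  # a well-formed number outside the valid range
    return port
```

Whether to recover silently, log the problem, or re-raise is a design decision that depends on how much the caller needs to know about the failure.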

Coding aspects

State-based logic

State-based programming consists of using a finite state machine to implement logic.^[22]
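
A minimal sketch of state-based logic in Python, using a coin-operated turnstile as a hypothetical example. The states, events, and the `run_turnstile` function are illustrative names, not part of any cited source.

```python
# Transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("locked", "coin"): "unlocked",
    ("locked", "push"): "locked",
    ("unlocked", "coin"): "unlocked",
    ("unlocked", "push"): "locked",
}


def run_turnstile(events, state="locked"):
    """Feed a sequence of events to the state machine and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state!r} on {event!r}")
        state = TRANSITIONS[key]
    return state
```

All behavior is captured by the current state and the transition table, so adding a state or event means adding table entries rather than new branches.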

Table-driven logic

Table-driven logic uses information formatted as a table to drive execution.^[24]
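
For example, a chain of if/else comparisons can be replaced by a lookup over a table. The Python sketch below uses a hypothetical grading scheme; the thresholds in `GRADE_TABLE` are invented for illustration.

```python
# Each row is (minimum score, letter), ordered from highest threshold to lowest.
GRADE_TABLE = [
    (90, "A"),
    (80, "B"),
    (70, "C"),
    (60, "D"),
]


def grade(score):
    """Return the letter for the first table row whose threshold the score meets."""
    for minimum, letter in GRADE_TABLE:
        if score >= minimum:
            return letter
    return "F"  # below every threshold in the table
```

Changing the grading policy then means editing the table rather than the control flow.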

Runtime configuration

Runtime configuration is a technique that binds variable values and program settings when the program is running, usually by updating and reading configuration files.
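
A minimal sketch, assuming settings are stored in a JSON file: values are bound when the program runs by reading the file and overlaying it on built-in defaults. The setting names in `DEFAULTS` and the function `load_settings` are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical built-in defaults, used when the file is missing or omits a key.
DEFAULTS = {"log_level": "INFO", "max_connections": 10}


def load_settings(path):
    """Return the defaults overlaid with any values found in the JSON file at path."""
    settings = dict(DEFAULTS)  # copy so the defaults are never mutated
    config_file = Path(path)
    if config_file.is_file():
        with config_file.open(encoding="utf-8") as handle:
            settings.update(json.load(handle))
    return settings
```

Calling `load_settings` again after the file changes picks up the new values without rebuilding the program.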

Internationalization

Internationalization is the activity of preparing a program to support multiple locales, and localization is the activity of adapting it to a specific locale.^[24]
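
The core idea can be sketched in Python by keeping user-visible strings out of the program logic and looking them up per locale. The catalog `MESSAGES` and the function `translate` are hypothetical; production code would more likely use an established mechanism such as Python's standard `gettext` module.

```python
# Hypothetical message catalog: locale -> message key -> template.
MESSAGES = {
    "en": {"greeting": "Hello, {name}!"},
    "es": {"greeting": "¡Hola, {name}!"},
}

FALLBACK_LOCALE = "en"


def translate(key, locale, **params):
    """Return the message for key in the given locale, falling back to English."""
    catalog = MESSAGES.get(locale, MESSAGES[FALLBACK_LOCALE])
    template = catalog.get(key, MESSAGES[FALLBACK_LOCALE][key])
    return template.format(**params)
```

Supporting a new locale then means adding a catalog entry, without touching the code that calls `translate`.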
