What is the process of reviewing program code?

The high cost of code reviews, combined with benefits that often do not match our assumptions, frequently leads us to use them in our workflows in ways that are not efficient.

For example, requiring two sign-offs for all code changes indiscriminately will make the costs exceed the benefits of code review in at least some cases. Moreover, since code reviews find commit-blocking defects relatively infrequently, it might be prudent to change the practices to better fit that finding. One of Microsoft’s large teams recently instituted a policy in which a developer is allowed to proceed with a commit after the very first code review sign-off. If further comments arrive after that, another commit can be made to finalize the change.


URL: https://www.sciencedirect.com/science/article/pii/B9780128042069000246

Synthesizing Knowledge from Software Development Artifacts

Olga Baysal, ... Michael W. Godfrey, in The Art and Science of Analyzing Software Data, 2015

4.4.2 WebKit

The lifecycle model can be easily modified according to the dataset at hand. For example, we have applied the pattern to study the code review process of the WebKit project [11]. The model of the patch lifecycle is shown in Figure 4.4.


Figure 4.4. WebKit’s patch lifecycle.

Since WebKit is an industrial project, we were particularly interested in comparing its code review process to that of Mozilla, which is run in a more traditional open source development style. To do so, we extracted WebKit’s patch lifecycle (Figure 4.4) and compared it with the previously studied patch lifecycle of Mozilla Firefox [8] (Figure 4.2).

The patch lifecycle captures the various states patches undergo during the review process, and characterizes how the patches transition between these states. The patch lifecycles enable large data sets to be aggregated in a way that is convenient for analysis. For example, we were surprised to discover that a large proportion of patches that have been marked as accepted are subsequently resubmitted by authors for further revision. We can also see that rejected patches are usually resubmitted, which might ease concerns that rejecting a borderline patch could cause it to be abandoned.

While the set of states in our patch lifecycle models of WebKit and Firefox is the same, WebKit has fewer state transitions; this is because the WebKit project does not employ a “super-review” policy. Furthermore, unlike in Mozilla, there are no self-edges on the “Accepted” and “Rejected” states in WebKit; this is because Mozilla patches are often reviewed by two people, while WebKit patches receive only individual reviews. Finally, the WebKit model introduces a new edge between “Submitted” and “Resubmitted”: WebKit developers frequently “obsolete” their own patches and submit updates before they receive any reviews at all. One reason for this behavior is that submitted patches can be automatically validated by the external test system, so developers can submit patches before they are reviewed to see whether they fail any tests. Altogether, however, comparing the two patch lifecycles suggests that the WebKit and Firefox code review processes are fairly similar in practice.
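The lifecycle model is essentially a state machine over review events. As a rough illustration (not the authors’ tooling), the following Python sketch counts state-to-state transitions from per-patch event histories; the state names follow the lifecycle figures, but the event data and its format are hypothetical.

from collections import Counter

# Hypothetical review histories per patch; state names follow the lifecycle figures.
PATCH_EVENTS = {
    "patch-1001": ["Submitted", "Accepted", "Landed"],
    "patch-1002": ["Submitted", "Rejected", "Resubmitted", "Accepted", "Landed"],
    "patch-1003": ["Submitted", "Resubmitted", "Accepted", "Landed"],  # obsoleted before any review
}

def transition_counts(patch_events):
    """Count how often patches move from one review state to the next."""
    counts = Counter()
    for states in patch_events.values():
        for src, dst in zip(states, states[1:]):
            counts[(src, dst)] += 1
    return counts

for (src, dst), n in sorted(transition_counts(PATCH_EVENTS).items()):
    print(f"{src} -> {dst}: {n}")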


URL: https://www.sciencedirect.com/science/article/pii/B9780124115194000045

A Mixed Methods Approach to Mining Code Review Data

Peter C. Rigby, ... Murtuza Mukadam, in The Art and Science of Analyzing Software Data, 2015

9.3.6 Data Measures and Attributes

A code review is effective if the proposed changes are eventually accepted or bugs are prevented, and it is efficient if the time this takes is as short as possible. To study patch acceptance and rejection and the speed of the code review process, a framework for extracting meta-information about code reviews is required. Code reviewing processes are common in both OSS [5, 9] and commercial software [6, 7] development environments. Researchers have identified and studied code review in contexts such as patch submission and acceptance [17, 20–22] and bug triaging [23, 24]. Our metamodel of code review features is presented in Table 9.1. To develop this metamodel, we included the features used in existing work on code review. The metamodel is not exhaustive, but it forms a basis that other researchers can extend. Its features can be split into three broad categories:

Table 9.1. Metamodel for Code Review Analysis

Code review features
num_commits: Number of commits in the proposed change
src_churn: Number of lines changed (added and deleted) by the proposed change
test_churn: Number of test lines changed in the proposed change
files_changed: Number of files touched by the proposed change
num_comments: Discussion and code review comments
num_participants: Number of participants in the code review discussion

Project features
sloc: Executable lines of code when the proposed change was created
team_size: Number of active core team members during the last 3 months prior to the proposed change creation
perc_ext_contribs: The ratio of commits from external members over core team members in the last n months
commits_files_touched: Number of total commits on files touched by the proposed change n months before the proposed change creation time
test_lines_per_kloc: A proxy for the project’s test coverage

Developer features
prev_changes: Number of changes submitted by a specific developer, prior to the examined proposed change
requester_succ_rate: Percentage of the developer’s changes that have been integrated up to the creation of the examined proposed change
reputation: Quantification of the developer’s reputation in the project’s community (e.g., followers on GitHub)
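To make the metamodel concrete, it can be written down directly as a data structure. The Python sketch below is our own illustration, not part of the original study; the field names simply mirror Table 9.1.

from dataclasses import dataclass

@dataclass
class CodeReviewRecord:
    """One observation in the code review metamodel; fields mirror Table 9.1."""
    # Code review features
    num_commits: int
    src_churn: int
    test_churn: int
    files_changed: int
    num_comments: int
    num_participants: int
    # Project features
    sloc: int
    team_size: int
    perc_ext_contribs: float
    commits_files_touched: int
    test_lines_per_kloc: float
    # Developer features
    prev_changes: int
    requester_succ_rate: float
    reputation: int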

1. Proposed change features. These characteristics attempt to quantify the impact of the proposed change on the affected code base. When external code contributions are examined, the size of the patch affects both acceptance and acceptance time [21]. Researchers have used various metrics to determine the size of a patch: code churn [20, 25], changed files [20], and number of commits. In the particular case of GitHub pull requests, developers reported that the presence of tests in a pull request increases their confidence to merge it (Pham et al. [26]). The number of participants has been shown to influence the amount of time taken to conduct a code review [7].

2. Project features. These features quantify the receptiveness of a project to an incoming code change. If the project’s process is open to external contributions, then we expect to see an increased ratio of external contributors over team members. The project’s size may be a detrimental factor in the speed of processing a proposed change, as its impact may be more difficult to assess. Also, incoming changes tend to cluster over time (the “yesterday’s weather” change pattern [27]), so it is natural to assume that proposed changes affecting a part of the system that is under active development will be more likely to merge. Testing plays a role in the speed of processing; according to [26], projects struggling with a constant flux of contributors use testing, manual or preferably automated, as a safety net to handle contributions from unknown developers.

3. Developer. Developer-based features attempt to quantify the influence that the person who created the proposed change has on the decision to merge it and the time to process it. In particular, the developer who created the patch has been shown to influence the patch acceptance decision [28] (recent work on different systems interestingly reported opposite results [29]). To abstract the results across projects with different developers, researchers devised features that quantify the developer’s track record [30], namely, the number of previous proposed changes and their acceptance rate; the former has been identified as a strong indicator of proposed change quality [26]. Finally, Bird et al. [31] presented evidence that social reputation has an impact on whether a patch will be merged; consequently, features that quantify the developer’s social reputation (e.g., followers in GitHub’s case) can be used to track this. (A minimal feature-extraction sketch follows this list.)
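To make a few of these features concrete, the sketch below (our own illustration; the diff and history formats are assumptions) derives src_churn and files_changed from a unified diff and computes requester_succ_rate from a developer’s prior proposed changes.

def diff_features(unified_diff):
    """Compute src_churn (added + deleted lines) and files_changed from a unified diff."""
    churn = 0
    files = set()
    for line in unified_diff.splitlines():
        if line.startswith("+++ ") or line.startswith("--- "):
            path = line[4:].strip()
            if path.startswith(("a/", "b/")):
                path = path[2:]
            if path != "/dev/null":
                files.add(path)
        elif line.startswith("+") and not line.startswith("+++"):
            churn += 1
        elif line.startswith("-") and not line.startswith("---"):
            churn += 1
    return {"src_churn": churn, "files_changed": len(files)}

def requester_succ_rate(prior_changes):
    """Fraction of a developer's earlier proposed changes that were merged.

    prior_changes is a hypothetical list of dicts such as {"merged": True}.
    """
    if not prior_changes:
        return 0.0
    merged = sum(1 for change in prior_changes if change.get("merged"))
    return merged / len(prior_changes)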


URL: https://www.sciencedirect.com/science/article/pii/B9780124115194000094

Microsoft Vista: An Overview

In Microsoft Vista for IT Security Professionals, 2007

Summary

Windows Vista represents Microsoft’s view of the future of computing. With Vista, Microsoft wants to improve the user’s experience, as well as change public perceptions of the insecurity and unreliability of Windows operating systems. Microsoft has taken a hard stance on security through an improved code review process and the employment of third-party individuals to scrutinize its software. The implementation of innovative security controls such as ASLR and a layered approach to total system security has also helped Microsoft produce a much more secure OS.

Microsoft is also seeing the benefit of having a code base that is maturing and becoming more robust with each release and service pack. Vista also includes several mature add-on products such as Internet Explorer 7 and Windows Media Player 11. IE 7 offers users improved security and a much more efficient user interface, with features such as tabbed browsing and the phishing filter. Are these new features enough to warrant an upgrade in the home or office, though? This is the battle that Microsoft must now fight. Wide adoption of Vista will be slow at first, and the main source of new Vista machines will not be users upgrading their current systems; most new Vista machines will come from OEMs such as Dell and Gateway. Another factor is that DirectX (DX) 10 will not be available for Windows XP: gamers will need to upgrade their systems with new graphics cards, as well as to Vista, to support DX 10 and the new games that will take advantage of it.

The next year will be interesting for Microsoft and the IT world in general. Most sys admins will choose to wait before fully adopting Vista as their platform of choice. What happens in the first year, how many security flaws and bugs are found, and how Microsoft responds to those issues will greatly affect the choices of many IT departments.


URL: https://www.sciencedirect.com/science/article/pii/B9781597491396500054

Validating the Architecture

Murat Erder, Pierre Pureur, in Continuous Architecture, 2016

Code Inspections

In addition to the decision-based architecture checkups, the team also conducts periodic code inspections. Most of those reviews are automated using static code analysis tools as part of the continuous deployment process (see Chapter 5 for a discussion of the continuous deployment process). However, there may be times when a manual evaluation is required to supplement the static code analysis tools, for example, when a component is unusually complex or exhibits issues during performance testing of the system. These reviews are simple checklist-based validations that essentially ensure that the architecture decisions have been properly implemented in the code and that the code is well written and easy to understand.


Code inspections can be achieved either by manual code reviews or by using static analysis tools.

Manual Code Reviews

A code review process is much simpler than an architecture review. A team of experts gets together with the author of the code and manually inspects that code to discover defects. It is, of course, much more efficient to discover defects before a system is deployed than after deployment.

Depending on the programming language being used, a typical code inspection review would look at 200 to 400 lines of code, so a decision needs to be made before the meeting about which components need to be manually inspected.

Static Code Analysis

Static program analysis is the analysis of computer software that is performed without actually executing programs (analysis performed on executing programs is known as dynamic analysis). In most cases the analysis is performed on some version of the source code, and in other cases on some form of the object code. The term is usually applied to the analysis performed by an automated tool, with human analysis being called program understanding, program comprehension, or code review.5,6

A number of static code analysis tools are available to save time on most code reviews. Their advantage is that they are able to inspect 100% of the code, but they may not be able to find every defect that an expert would, assuming the expert had time to inspect 100% of the code. Static code analysis tools can be either open source or proprietary. Please see Wikipedia7 for a list of tools available for each commonly used programming language.
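As a hedged, minimal example of how such a tool can be folded into an automated review step, the Python sketch below shells out to flake8 (assuming it is installed; any comparable analyzer could be substituted) and reports whether the inspected files pass.

import subprocess
import sys

def static_check(paths):
    """Run flake8 (assumed installed) on the given paths; True means no findings."""
    result = subprocess.run(["flake8", *paths], capture_output=True, text=True)
    if result.stdout:
        # flake8 prints one finding per line: path:line:col: code message
        print(result.stdout)
    return result.returncode == 0

if __name__ == "__main__":
    ok = static_check(sys.argv[1:] or ["."])
    sys.exit(0 if ok else 1)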


URL: https://www.sciencedirect.com/science/article/pii/B9780128032848000063

Enterprise Web Application Testing

Shailesh Kumar Shivakumar, in Architecting High Performing, Scalable and Available Enterprise Web Applications, 2015

Defect prevention in various project phases

Proactive defect prevention includes various measures to incorporate the defect prevention practices into the project lifecycle. It also includes analyzing the historical defect trends and designing an effective prevention strategy to avoid recurrence. The list of defect prevention measures is given below:

Development phase: At this stage, we get the opportunity to prevent defects at their source. Use coding checklists, such as a JS/widget development checklist, to ensure that code complies with optimal coding standards. Code should be testable, and it should undergo a thorough review process including peer review. Incorporate tool-based static code analysis and automate the code review process to detect code-related issues early. Ensure good code coverage from automated and manual testing.

Defect modeling and prediction: Use the historical data of defects from previous releases to gain insights into:

A component that has a high percentage of defects

The category of defect root causes

Defect trends and patterns

Once we analyze the defect categories and their root causes, we can develop a defect modeling mechanism to predict future defects. The prediction can be used to take proactive and corrective actions. Some examples of problem categories and preventive corrective actions are listed in Table 6.3 (a minimal sketch of such an analysis appears after this list).

Table 6.3. Problem analysis and corrective measures

Problem category and analysis: 8% of defects in the past three releases are related to the login module
Preventive corrective measures:
• Re-review the design of the login module and check for parameters such as “separation of concerns,” “coupling,” and “reusability”
• Ensure that test coverage for the login module is 90% so that all functionality is covered
• Profile the login module component to understand the memory, CPU utilization, and other runtime behavior in different load conditions. Optimize the code based on profiling
• Review in more detail the classes for which the defect rate is high, and look for potential issues

Problem category and analysis: 20% of defects are due to “improper requirements understanding”
Preventive corrective measures:
• Make the requirements management process more robust by adding various views of requirements such as use cases, flow diagrams, process diagrams, and business rules
• Track the requirements to their implementation and test cases using a “requirements traceability matrix”
• Initiate a “requirement champion” program, which encourages the development team and QA team to have a thorough and complete understanding of the overall application

Problem category and analysis: 10% of defects are caused by build errors and regression issues
Preventive corrective measures:
• Automate the build and deployment process
• Adopt continuous build and validation
• Add automatic regression testing as part of the build process and monitor the build issues through the project dashboard

Proactive process efficiency validation: To have a positive impact on usability and the user experience, we need to incorporate features that help the user perform tasks efficiently. This includes features such as on-demand data loading, client-side aggregation, quicker process alternatives, and improved perceived page performance. This efficiency has a positive impact on the end-user experience.

Integration phase: Build and integrate continuously. This iterative strategy allows the testing team to test the interface-related test cases early and catch performance-related issues. Automate and add all important test cases to the continuous build process. Profile the integrated application to identify memory leaks or performance issues.

Perform requirement validation: Build a requirements traceability matrix to ensure that functionality is completely and correctly built as per the provided specifications. Test nonfunctional requirements (NFRs) such as performance and scalability early in the game.

Perform continuous security assessments including risk analysis, early security testing, and continuous threat assessment.
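To illustrate the kind of historical analysis behind Table 6.3 (referenced above), the Python sketch below computes each component’s and each root cause’s share of past defects and flags categories above a chosen threshold. The record format, the sample data, and the threshold are assumptions made for illustration only.

from collections import Counter

# Hypothetical historical defect records; the field names are illustrative.
DEFECTS = [
    {"release": "R1", "component": "login", "root_cause": "requirements"},
    {"release": "R1", "component": "search", "root_cause": "build/regression"},
    {"release": "R2", "component": "login", "root_cause": "coding"},
    {"release": "R3", "component": "login", "root_cause": "requirements"},
]

def defect_shares(defects, key):
    """Share of all defects per category (e.g., per component or per root cause)."""
    counts = Counter(d[key] for d in defects)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

def flag_hotspots(defects, key, threshold=0.2):
    """Categories whose defect share meets or exceeds the chosen threshold."""
    return [k for k, share in defect_shares(defects, key).items() if share >= threshold]

print(defect_shares(DEFECTS, "component"))
print(flag_hotspots(DEFECTS, "root_cause", threshold=0.25))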

Proactive defect detection mostly happens postproduction. This helps in providing an early fix to reduce the overall impact of the issue:

Continuous real-time SLA monitoring: The critical SLAs of enterprise applications should be monitored continuously post “go-live.” For a global application, a multi-geo robots-based monitoring method is needed to get the real performance numbers across geographies. The SLAs usually monitored include perceived page performance, average page response time, component load time, and application availability.

Proactive user feedback solicitation: In some scenarios, it is worthwhile to conduct user surveys and opinion polls to solicit feedback from the end customers, in order to understand the usability of the site and the overall satisfaction index.

An automatic alert and notification infrastructure should alert the site administrators if the application SLA falls below the configured threshold.

Analytics-driven insight gathering: Web analytics tagging should be incorporated in critical success paths and for components in the application. The reports generated would provide crucial insights into user behavior and usage of the components.

Internal system health check monitoring: Since enterprise applications rely on internal systems for services and feeds, all those upstream systems should be monitored in real time, using a heartbeat monitoring mechanism. This should be coupled with CPU/memory and network monitoring of all systems. Any service or system outages should prompt immediate notification.

Proactive log monitoring: Monitor the application and server logs on a regular basis for early identification of any issues, and take the corrective measures.


URL: https://www.sciencedirect.com/science/article/pii/B9780128022580000068

Secure Development Life Cycle

Zhendong Ma, ... Paul Murdock, in Smart Grid Security, 2015

8.2.2 Microsoft Security Development Lifecycle

The Microsoft Security Development Lifecycle (MSDL) (Howard, 2006) is arguably the best-known security development lifecycle specification. In essence, MSDL is a software development security assurance process that aims to integrate security and privacy practices into all phases of the software development lifecycle. Originally introduced to improve the security of Microsoft software products, it has gained wide recognition and has eventually been adopted by software developers outside Microsoft.

As shown in Figure 8.2, the main part of MSDL includes five phases: requirements, design, implementation, verification, and release. In the requirements phase, security and privacy requirements analysis produces a preliminary requirement specification. Quality gates and bug bars are used to define the criteria for acceptable security and privacy levels to be met in the development lifecycle. Risk assessment is used to identify the areas of focus of a software project that need special attention and hence security and privacy measures. Such a risk-driven approach ensures cost-effectiveness in the planning and distribution of project efforts.


Figure 8.2. Microsoft security development lifecycle.

In the design phase, design requirements generate design specifications for security and privacy and for security features. The activity of attack surface reduction minimises risks related to potential vulnerabilities and exploits. Threat modelling is used to analyse and identify security risks of the software architecture in a structured way.

In the implementation phase, a development toolset with approved security properties, e.g. a toolset without known security flaws, is defined by the developers. Deprecating old and potentially unsafe APIs minimises the risk of security holes in the software. Static analysis of source code is used as part of the code review process for the implementation.

In the verification phase, dynamic program analysis and fuzz testing verify the software in a run-time environment. The verification is completed with a re-review of the threat models and the attack surface to ensure that the implementation follows the design specifications.

In the release phase, an incident response plan ensures that security will be continuously maintained after the release. A final security review examines all security-related activities in the project. Release and archive are control steps in which the software is certified to fulfil security and privacy requirements.

Threat modelling (Swiderski, 2004) is proposed as a recurring procedure in MSDL in order to identify threats and their impact on a system design. In general, threat modelling specifies the system in data flow diagrams, enumerates all possible threats, estimates their impact on the system, and optionally lists their possible mitigations.

Originally, threat modelling was demonstrated as an approach for developing web applications. It is divided into six steps: (1) identify assets, (2) create an architecture overview, (3) decompose the application, (4) identify the threats, (5) document the threats, and (6) rate the threats. The output of threat modelling is a document with a definition of the architecture of the application and a list of threats for the application scenario.

In the first step, the information assets that need to be protected are identified. The second step is to determine the function, architecture, configuration, and technologies of the system. Three tasks are suggested: identify the function of the application through use cases, create a high-level architecture diagram, and identify the technologies and their implementation details. For complex systems, several diagrams of subsystems might be needed to describe the whole system.

In the third step, the main tasks are to identify trust boundaries, data flow, entry points, and privileged code. The findings are documented in a so-called “security profile”. A trust boundary can be the boundary of an organization’s IT infrastructure, in which all units follow and implement the same security policies. Trust boundaries can also be the boundaries in which all subsystems are mutually authenticated and authorized. Threat modelling proposes to analyse an application’s data flow at a high level and decompose it iteratively down to subsystems or sub-subsystems. Entry points are the attack surface where attacks might happen. Analysing and looking for privileged code in a system can help to identify code that runs privileged operations or accesses restricted resources.

A data flow diagram (DFD) is a modelling method that makes this analysis easier and more intuitive. A DFD uses designated symbols to represent external entities, processes and multi-processes, data stores, data flows, and trust boundaries. A process is any software component involved in data processing. Figure 8.3 shows an example of a DFD used in threat modelling for a web application. In the figure, a double circle represents a “multiple process”, a collection of applications that process data or perform a certain action. A single circle represents a single process. A square represents an external entity, e.g. a user or a third-party web application. The parallel lines represent a data store; in this example, the data stores are two databases and one image of the database. A dotted curve annotates the trust boundary. An arrow represents data flow in the system.


Figure 8.3. An example DFD for web application threat modelling.

In the fourth step, threats are identified and enumerated for each of the components in the DFD. The red lines in the example in Figure 8.3 indicate the identified potential threats in the web application. The Microsoft STRIDE method is used to help software developers identify threats. The mnemonic STRIDE stands for Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, and Elevation of privilege. This six-letter acronym is used as a set of guidewords for identifying network, host-based, and application threats. Alternatively, if a categorized threat list is available, it can also be used for threat identification. With the help of the STRIDE method or a threat taxonomy, threat identification can leverage existing threat knowledge, making the process systematic and efficient with reduced error and uncertainty.
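As a rough illustration of using STRIDE as guidewords per DFD element, the Python sketch below pairs each element with every STRIDE category to produce a checklist of candidate threats. The element names are hypothetical, and in practice not every category applies to every element type; this is a simplified starting point, not the full Microsoft method.

# The six STRIDE categories, used as guidewords.
STRIDE = [
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information disclosure",
    "Denial of service",
    "Elevation of privilege",
]

# Hypothetical DFD elements for a small web application.
DFD_ELEMENTS = [
    "User (external entity)",
    "Web server (process)",
    "Orders database (data store)",
    "HTTP request (data flow)",
]

def stride_checklist(elements):
    """Pair every DFD element with every STRIDE guideword as a candidate threat to review."""
    return [(element, category) for element in elements for category in STRIDE]

for element, category in stride_checklist(DFD_ELEMENTS):
    print(f"Consider {category} against {element}")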

Other approaches such as attack trees or attack patterns are also suggested to facilitate threat identification. Attack trees (Schneier, Attack Trees, December 1999) are a structured way to identify potential threats to an information security asset. An attack tree is literally a tree structure, usually represented graphically, used to identify possible attacks and establish their relationships. It consists of nodes and edges. Nodes in an attack tree represent an attacker’s actions. An attack tree uses a root node to specify the attacker’s goal and systematically expands the tree with leaf nodes to enumerate possible attacks that contribute to reaching the goal. The leaf nodes are grouped by logical AND and OR relations. Attack trees provide a structured way of security analysis and have the potential to identify hidden threats. For each attack goal, a new tree needs to be constructed.
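A minimal sketch of an attack tree as a data structure, with children grouped by AND/OR relations, is shown below; the example goal, leaves, and feasibility judgements are hypothetical, and a real analysis would typically attach costs or probabilities rather than a simple boolean.

from dataclasses import dataclass, field
from typing import List

@dataclass
class AttackNode:
    """A node in an attack tree; children combine with an AND or OR gate."""
    action: str
    gate: str = "OR"                # "AND": all children needed; "OR": any child suffices
    children: List["AttackNode"] = field(default_factory=list)
    feasible: bool = False          # leaf judgement: is this attack step considered feasible?

    def achievable(self) -> bool:
        """Whether the goal at this node is achievable given the leaf judgements."""
        if not self.children:
            return self.feasible
        results = [child.achievable() for child in self.children]
        return all(results) if self.gate == "AND" else any(results)

# Hypothetical tree: the root node is the attacker's goal.
goal = AttackNode("Obtain user credentials", gate="OR", children=[
    AttackNode("Phish the user", feasible=True),
    AttackNode("Steal and crack the password database", gate="AND", children=[
        AttackNode("Exfiltrate the password database", feasible=False),
        AttackNode("Crack the password hashes", feasible=True),
    ]),
])

print(goal.achievable())  # True: the OR branch "Phish the user" is judged feasible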

Although attack trees can be useful for enumerating the threats related to a system design in a systematic way, the analysis can be very tedious and does not scale well to large systems. To make the process more efficient, reusable attack patterns have been proposed to assist it. Attack patterns are abstract descriptions of common attacks on information security. Since patterns capture and describe recurring problems and solutions, attack patterns aim to capture the core of the methods and procedures associated with exploiting a computer system (Moore, 2001). At the same time, since attack patterns do not include specific details about actual exploits, they avoid the danger of providing full details to a malicious user, who might otherwise directly apply the attacks.

The fifth step in the threat modelling process requires documentation of the identified threats and their descriptions. In the sixth step, the risk associated with each threat is calculated as the product of probability and damage potential, and the results are rated. The estimates of probability and damage (i.e. impact) can be expressed numerically or converted from qualitative ratings, e.g. High, Medium, and Low can be converted to 3, 2, and 1, respectively. Although the ratings can be subjective and imprecise, influenced by the knowledge and experience of the person conducting the analysis, they provide a viable way to rank and prioritize threats.
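The rating arithmetic of the sixth step can be sketched in a few lines of Python; the High/Medium/Low to 3/2/1 conversion follows the text, while the threat list below is hypothetical.

# Convert qualitative estimates to numbers, as described in the text.
SCALE = {"High": 3, "Medium": 2, "Low": 1}

def risk(probability, damage):
    """Risk = probability x damage potential, on the 3/2/1 scale."""
    return SCALE[probability] * SCALE[damage]

# Hypothetical threats with (probability, damage potential) estimates.
threats = [
    ("SQL injection against the orders database", "Medium", "High"),
    ("Session token theft over an unencrypted link", "Low", "High"),
    ("Log tampering by an insider", "Low", "Medium"),
]

# Rank the threats from highest to lowest risk.
for name, p, d in sorted(threats, key=lambda t: risk(t[1], t[2]), reverse=True):
    print(f"{risk(p, d):>2}  {name}")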

In general, threat modelling is designed as a “lightweight” software security approach for system designers to make secure design choices about technologies and functions, for developers to write code to mitigate security risks, and for testers to write better test cases to test security requirements.

Microsoft SDL is endorsed and followed by many individuals and organisations in the industry. It has been shown that strong encryption, authentication, and authorisation were often poorly implemented in smart meters (Pennell, 2010). Therefore, besides adding layered defences, the use of SDL will proactively support the implementation of security and privacy measures during a Smart Grid product’s development, third-party auditing, and final software review.

The adoption of the Microsoft SDL for the secure development of smart meters has been demonstrated in a case study (Microsoft, 2012). As the study shows, SDL deployment requires a dramatic change in mindset, in which engineers and developers learn to find ways to stop things from working instead of following the conventional mindset of making things work. During the development lifecycle, security risks at all layers of the smart meter functionality are identified in brainstorming sessions. The identified threats are analysed and mitigations are deployed. The threats are then tested to provide evidence of mitigation. The subsystems of the meter, as well as their integration, are considered to provide end-to-end security. During the development phase, layers of mitigations are added iteratively until the threats and vulnerabilities are reduced to the point of marginal concern.


URL: https://www.sciencedirect.com/science/article/pii/B9780128021224000080

A review of code reviewer recommendation studies: Challenges and future directions

H. Alperen Çetin, ... Eray Tüzün, in Science of Computer Programming, 2021

Abstract

Code review is the process of inspecting code changes by a developer who is not involved in the development of the changeset. One of the initial and important steps of the code review process is selecting code reviewer(s) for a given code change. To maximize the benefits of the code review process, the appropriate selection of reviewers is essential. Code reviewer recommendation has been an active research area over the last few years, and many recommendation models have been proposed in the literature.

In this study, we conduct a systematic literature review by inspecting 29 primary studies published from 2009 to 2020. Based on the outcomes of our review: (1) most preferred approaches are heuristic approaches closely followed by machine learning approaches, (2) the majority of the studies use open source projects to evaluate their models, (3) the majority of the studies prefer incremental training set validation techniques, (4) most studies suffer from reproducibility problems, (5) model generalizability and dataset integrity are the most common validity threats for the models and (6) refining models and conducting additional experiments are the most common future work discussions in the studies.

What is the process of code review in your project?

Code review is an integral part of software development that identifies bugs and defects before the testing phase. It is often overlooked as an ongoing practice during the development phase, but many studies suggest it is one of the most effective quality assurance strategies.

What are the three types of code reviews?

Code review practices fall into three main categories: pair programming, formal code review and lightweight code review.

What is code review in coding?

Code review is a software quality assurance process in which a software’s source code is analyzed manually by a team or by using an automated code review tool. The motive is to find bugs, resolve errors, and, most of the time, improve code quality.

Why is code reviewed?

Code review helps developers learn the code base, as well as learn new technologies and techniques that grow their skill sets.