The Clean Coder

Robert C. Martin

Mentioned 4

Presents practical advice on the disciplines, techniques, tools, and practices of computer programming and how to approach software development with a sense of pride, honor, and self-respect.

More on Amazon.com

Mentioned in questions and answers.

I work in a computational biology lab, where we have several folks working on multiple projects, mostly in R (which is what I care about for this post). In the past, people would simply develop their code for each project, which may or may not involve boilerplate code copied over from previous projects. One thing that I've pushed over the years was to bring some centralized structure to this mess and have people identify common patterns such that we can turn these repeated/common blocks of code into packages for all of the many reasons one might think that is a good thing to do. So now our folks are using a mix of centralized packages/routines within their project specific scripts.

There's one gotcha here. We have a mandate from the powers that be that every script for every project need to be 100% reproducible over time to the best of our ability (and this includes 100% of all code we have direct access to, including our packages). That is, if I call function foo in package bar with parameter A to get result X today, 4 years from now I should get the exact same result. (erroneous output due to bugs is excepted here)

The topic of reproducibility has come up now and then in R within various circles, but typically it seems to be discussed in terms of reproducibility of process (e.g. vignettes). This is not the same thing - I can run a vignette today and then run the same code 6 months from now using updated packages and receive wildly different results.

The solution that's been agreed upon (which I'm not a fan of) is that if a function or package needs to be changed in a non-backwards compatible change that it simply gets a new name. Thus, if we needed to radically change function foo(), it'd be called foo2(), and if that needs a radical change it gets called foo3(). This ensures that any script that called foo() will always get the original result, while allowing things to march forward within the package repository. It works, but I really dislike this - it seems aesthetically extremely cluttered, and I worry that it will lead to mass confusion over time having packages bar, bar2, bar3, bar4 ... functions foo1, foo2, foo3, etc.

The problem is that I haven't come up with an alternate solution that's really better. One possibility would be to note version numbers of packages, R, etc and make sure those are loaded, but that has multiple problems - not the least of which is that it relies on proper package versioning discipline and that's prone to error. Also, this alternative was already rejected ;) Ideally what we'd have is some sort of notion of devel & release as most of these changes tend to happen earlier on and then level off with changes happening much less frequently. OTOH what devel really means here is "not actually in a package yet" (which we do), but it can be hard to determine exactly at what point is the right one to transport stuff over. Invariably the moment you think you're safe, that's when you realize you're not.

So with all this in mind, I'm curious if anyone else out there has dealt with similar situations, and how they might have resolved things.

edit: just to be clear, by non-backwards compatible, I'm not just talking about APIs and such, but also outputs for a given set of inputs.

A solution might be to use S4 methods and letting R's internal dispatcher do the work for you (see example below). That way, you're somewhat "bulletproof" with respect to being able to systematically update your code without running the risk of breaking something.

Key benefits

The key thing here is that S4 method support multiple dispatch.

That way your function will always be foo (as opposed to having to keep track of foo1, foo2 etc.) while new functionality can be easily implemented (by adding respective methods) without touching "old" methods (that other people/packages might rely on).

Key functions you'll need:

  • setGeneric
  • setMethod
  • setRefClass (S4 Reference Classes; personal recommendation) or setClass (S4 Class; I wouldn't use them for the reason described in the "Additional remarks" at the very end)

The "downsides"

  • You need to switch from a S3 to a S4 logic

  • This implies that you need to write a bit more code than what you might be used to (generic method definitions, method definitions and possibly own class defitions (see example below). But this "buys" yourself and your code much more structure and makes it more robust.

  • It might also imply that you'll eventually dig deeper and deeper into the world of Object-Oriented Programming or Object-Oriented Design. While I personally consider this to be a good thing (my personal rule of thumb: the more complex/distributed your application, the better you're off using OOP), some would consider these approaches to be R-untypic (I strongly disagree as R does have superb OO-features that are maintained by the Core Team) or "unsuited" for R (this might be true depending on how much you rely on "non-OOP" packages/code). If you're willing to go that way, you might want to familiarize yourself with the SOLID principles of Object-Oriented Design. You also might want to check out the following books: Clean Coder and The Pragmatic Programmer.

  • If computational efficiency (e.g. when estimating statistical models) is really critical, using S4 methods and S4 Reference Classes might slow you down a bit. After all, there's more code involved compared to S3. But I'd recommend testing the impact of this from case to case via system.time() and/or microbenchmark::microbenchmark() instead of picking "ideological" sides (S3 vs. S4).


Example

Initial function

Let's suppose you're in department A and someone in your team started out with creating a function called foo()

foo <- function(x, y) {
    x + y
}
foo(x=10, y=20)

First change request

You would like to be able to extend it without breaking "old" code that relies on foo().

Now, I think we all agree that this can be quite hard to do.

You either need to explicitly modify the source code of foo() (each time running the risk that you break something that already used to work; this violates the "O" in SOLID: Open Closed-Principle) or you need to come with alternative names such as foo1, foo2 etc (really hard to keep track of which function is doing what).

foo <- function(x, y, type=c("old", "new")) {
    type <- match.arg(type, choices=c("old", "new")) 
    if (type == "old") {
        x + y
    } else if (type == "new") {
        x * y    
    }
}
foo(x=10, y=20)
[1] 30
foo(x=10, y=20, type="new")
[1] 200

foo1 <- function(x, y) {
    x * y
}
foo1(x=10, y=20)
[1] 200

Let's see how S4 methods and multiple dispatch can really help us out here.

Generic method

You need to start out by turning foo() into a generic method.

setGeneric(
    name="foo",
    signature=c("x", "y", ".ctx", ".ns"),
    def=function(x, y, ..., .ctx, .ns) {
        standardGeneric("foo")
    }
)

In simplified words: a generic method itself doesn't do anything yet. It's simply a precondition in order to be able to specifiy "actual" methods for its signature arguments that do something useful.

Signature arguments

The degree of flexiblity with respect to the original problem is directly linked to the number of signature arguments that you declare (signature=c("x", "y", ".ctx", ".ns")): the more signature arguments, the more flexiblity you have but the more complex your code might get as well (with respect to how much code you have to write).

Again, in simplified words: signature arguments (and it's classes) are used by the method dispatcher to retrieve the correct method that's doing the actual work.

Think of the method dispatcher being like the clerk in a ski rental business: you present him an arbitrary large set of signature information (i.e. information that "clearly distinguish you from others": your age, height, shoe size and skill level) and he uses that information to provide you with the right equipment to hit the slopes. Think of R's method dispatcher as beeing the clerk that has access to the storage room of the ski rental. But instead of ski equipment it will return methods.

Notice that we said that our "old" arguments x and y are from now on supposed to be signature arguments while there are also two new arguments: .ctx and .ns. I'll get to these in a minute. It's those arguments that will provide us with the flexibility that we're after.

Initial method definition

We now define a "variant" (a method) of the generic method for the following "signature scenario":

  1. x is numeric
  2. y is numeric
  3. .ctx will just not be provided when calling the method and is thus missing
  4. .ns will just not be provided when calling the method and is thus missing

Think of it as registering your signature information with explicit equipment of the ski rental. Once you did that and ask for your equipment, the only thing the clerk has to do is to go to the storage room and look up which equipment is linked to your personal information.

setMethod(
    f="foo", 
    signature=signature(x="numeric", y="numeric", .ctx="missing", .ns="missing"), 
    definition=function(x, y, ..., .ctx, .ns) {
        x + y
    }
)

When we call foo with this "signature scenario" (asking for the method that we registered for this scenario), the method dispatcher knows exactly which actual method it needs to get out of the storage room:

foo(x=10, y=20)
[1] 30

First update

Now someone from department B comes along, looks at foo(), likes it but decides that foo() needs to be updated (x * y instead of x + y) if it is to be used in his department.

That's when .ctx (short for context) comes into play: it's an argument by which we are able to distinguish application contexts.

Definining a class that represents the new application context

setRefClass("ApplicationContextDepartmentB")

When calling foo(), we'll provide it with an instance of this class (.ctx=new("ApplicationContextDepartmentB"))

Definining a new method for the new application context

Notice how we register signature argument .ctx to our new class ApplicationContextDepartmentB:

setMethod(
    f="foo", 
    signature=signature(x="numeric", y="numeric", 
        .ctx="ApplicationContextDepartmentB", .ns="missing"), 
    definition=function(x, y, ..., .ctx, .ns) {
        out <- x * y
        attributes(out)$description <- "I'm different from the original foo()"
        return(out)
    }
)

That way, the method dispatcher knows exactly that it should return the "new" method instead of the "old" method when we call foo() like this:

foo(x=1, y=10, .ctx=new("ApplicationContextDepartmentB"))
[1] 10
attr(,"description")
[1] "I'm different from the original foo()"

The "old" method is not affected at all:

foo(x=1, y=10)
[1] 30

Second update

Suppose that someone from department C comes along and suggests yet another "configuration" or version for foo(). You can easily provide that withouth breaking anything that you've realized for departments A and B so far by following the same routine as for department B.

But we'll even take it one step further here: we'll define two additional classes that let us distinguish different "namespaces" (that's where .ns comes into play).

Think of namespaces as a way of distinguishing different runtime scenarios for a specific method for a specific application context (i.e. "testing" and "productive mode").

Definining the classes

setRefClass("ApplicationContextDepartmentC")
setRefClass("TestNamespace")
setRefClass("ProductionNamespace")

Definining a new method for the new application context and a "test" scenario

Notice how we register signature arguments .ctx to our new class ApplicationContextDepartmentC and .ns to our new class TestNamespace:

setMethod(
    f="foo", 
    signature=signature(x="character", y="numeric", 
        .ctx="ApplicationContextDepartmentC", .ns="TestNamespace"), 
    definition=function(x, y, ..., .ctx, .ns) {
        data.frame(x, y, test.ok=rep(TRUE, length(x)))
    }
)

Again, the method dispatcher will look up the correct method when calling foo() like this:

foo(x=letters[1:5], y=11:15, .ctx=new("ApplicationContextDepartmentC"), 
    .ns=new("TestNamespace"))
  x  y test.ok
1 a 11    TRUE
2 b 12    TRUE
3 c 13    TRUE
4 d 14    TRUE
5 e 15    TRUE

Definining a new method for the new application context and a "productive" scenario

setMethod(
    f="foo", 
    signature=signature(x="character", y="numeric", 
        .ctx="ApplicationContextDepartmentC", .ns="ProductionNamespace"), 
    definition=function(x, y, ..., .ctx, .ns) {
        data.frame(x, y)
    }
)

We tell the method dispatcher that we now want the method registered for this scenario or namespace like this:

foo(x=letters[1:5], y=11:15, .ctx=new("ApplicationContextDepartmentC"), 
    .ns=new("ProductionNamespace"))

  x  y
1 a 11
2 b 12
3 c 13
4 d 14
5 e 15

Notice that you're free to use the classes TestNamespace and ProductionNamespace anywhere you'd like. These classes are not bound to ApplicationContextDepartmentC in any way, so you can for example also use the for all your other application scenarios.

Additional remarks for method definitions

Something that's often quite usefull is to start out with a method that accepts ANY classes for its signature arguments and define more restrictive methods as your software evolves:

setMethod(
    f="foo", 
    signature=signature(x="ANY", y="ANY", .ctx="missing", .ns="missing"), 
    definition=function(x, y, ..., .ctx, .ns) {
        message("Value of x:")
        print(x)
        message("Value of y:")
        print(y)
    }
)
foo(x="Hello World!", y=rep(TRUE, 3))
Value of x:
[1] "Hello World!"
Value of y:
[1] TRUE TRUE TRUE

Additional remarks for class definitions

I prefer S4 Reference Classes over S4 Classes because of the self-referencing capabilities of S4 Reference Classes:

setRefClass(
    Class="A", 
    fields=list(
        x1="numeric",
        x2="logical"
    ),
    methods=list(
        getX1=function() {
            .self$x1
        },
        getX2=function() {
            .self$x2
        },
        setX1=function(x) {
            .self$x1 <- x
        },
        setX2=function(x) {
            .self$field("x2", x)
        },
        addX1AndX2=function() {
            .self$getX1() + .self$getX2()
        }
    )
)
x <- new("A", x1=10, x2=TRUE)
x$getX1()
[1] 10
x$getX2()
[1] TRUE
x$addX1AndX2()
[1] 11

S4 Classes don't have that feature.

Subsequent modifications of field values:

x$setX1(100)
x$addX1AndX2()
[1] 101
x$x1 <- 1000
x$addX1AndX2()
[1] 1001

Additional remarks for documenting methods and classes

I strongly recommend using packages roxygen2 and devtools to document your methods and classes. You possibly might also want to look into package roxygen3.

Documenting generic methods with roxygen2:

#' Foo
#'
#' This method takes \code{x} and \code{y} and adds them.
#' 
#' Some details here
#' 
#' @param x \strong{Signature argument}.
#' @param y \strong{Signature argument}.
#' @param ... Further arguments to be passed to subsequent functions.
#' @param .ctx \strong{Signature argument}.
#'      Application context.
#' @param .ns \strong{Signature argument}.
#'      Application namespace. Usually used to distinguish different context 
#'      versions or configurations.
#' @author Janko Thyson \email{john.doe@@something.com}
#' @references \url{http://www.something.com/}
#' @example inst/examples/foo.R
#' @docType methods
#' @rdname foo-methods
#' @export

setGeneric(
    name="foo",
    signature=c("x", "y", ".ctx", ".ns"),
    def=function(x, y, ..., .ctx, .ns) {
        standardGeneric("foo")
    }
)

Documenting methods with roxygen2:

#' @param x \code{\link{character}}. Character vector.
#' @param y \code{\link{numeric}}. Numerical vector.  
#' @param .ctx \code{\link{ApplicationContextDepartmentC}}. 
#' @param .ns \code{\link{ProductionNamespace}}.  
#' @return \code{\link{data.frame}}. Some data frame.
#' @rdname foo-methods
#' @aliases foo,character,numeric,missing,missing-method
#' @export

setMethod(
    f="foo", 
    signature=signature(x="character", y="numeric", 
        .ctx="ApplicationContextDepartmentC", .ns="ProductionNamespace"), 
    definition=function(x, y, ..., .ctx, .ns) {
        data.frame(x, y)
    }
)

Hi i have method insertOrUpdateProductsToDB(Product product) is used to perform insert operation in database using catalogService of Broadleaf ,catalog Service is doing all saving operation in db . My method is expected restClient product as a parameter.After passing the restClient product we are converting this product into Broadleafproduct by using ProductConversion Class.In product conversion only setting is happening for converting rest Product into broadleafproduct. Now my requirement is to test this method using mockito but when i tried to do add these two line at the end of my test method

verify(mainProduct).getAdditionalSkus().add(sku);
       verify(mainProduct).setProductOptions(productOptionList);

Its failing.

when i debug the code there is for loop inside for loop in the method insertOrUpdateProductsToDB(Product product) and i find productOption = catalogService.saveProductOption(productOption); here productOption is coming null so please tell how to test loop inside loop and same happening for

for (Sku skuWithProductOptions : productConversion.createSkuWithProductOptions(product, mainProduct,productOptionList)) {
                    catalogService.saveSku(skuWithProductOptions);
                }

this line in the same method .kindly also check my test case whether i am doing right or not .

Class and insertOrUpdateProductsToDB(Product product) Method to be test

import com.admin.exception.AdminGenericException;
import com.admin.exception.AdminRestException;
import com.admin.util.helper.ProductConversion;
import com.admin.wrapper.getproducts.req.ObjectFactory;
import com.admin.wrapper.getproducts.resp.Product;
import com.admin.wrapper.getproducts.resp.Response;
import com.mycompany.rest.service.client.RestClientUtil;
import com.mycompany.util.constants.ApplicationConstants;

@Service
public class GetProductsServiceImpl {

    private static final Logger logger = Logger.getLogger(GetProductsServiceImpl.class);

    @Resource(name = "blCatalogService")
    protected CatalogService catalogService;

    public void setCatalogService(CatalogService catalogService) {
        this.catalogService = catalogService;
    }

    protected RestClientUtil restClientUtil;

    public void setRestClientUtil(RestClientUtil restClientUtil) {
        this.restClientUtil = restClientUtil;
    }

    @Value("#{configProperties['salePriceRate']}")
    private long salePriceRate;

    public void setRetailPriceRate(long retailPriceRate) {
        this.retailPriceRate = retailPriceRate;
    }

    @Value("#{configProperties['retailPriceRate']}")
    private long retailPriceRate;

    public void setSalePriceRate(long salePriceRate) {
        this.salePriceRate = salePriceRate;
    }


   //Insertion/Update DB logic
    public String insertOrUpdateProductsToDB(Product product) {
        logger.debug("Start of : insertOrUpdateProductsToDB()");
        try {
            List<String> category = new ArrayList<String>         (Arrays.asList(ApplicationConstants.CATEGORY));
            ProductConversion productConversion = new ProductConversion();
            List<ProductOption> productOptionList = new ArrayList<ProductOption>();
                if (category.contains(product.getCategory().toUpperCase())) {
                    org.broadleafcommerce.core.catalog.domain.Product mainProduct=catalogService.createProduct(new ProductType("org.broadleafcommerce.core.catalog.domain.Product", "Normal Product"));
                    mainProduct = productConversion.createProduct(mainProduct,product);
                    Sku sku=catalogService.createSku();
                    mainProduct.setDefaultSku(sku);
                    mainProduct = productConversion.addSkuToProduct(mainProduct, product, salePriceRate,retailPriceRate);
                    for (ProductOption productOption : productConversion.createProductOptions(product, mainProduct)) {
                        productOption.setAllowedValues(productConversion.createProductOptionValues(product,productOption));
                        productOption = catalogService.saveProductOption(productOption);
                        productOptionList.add(productOption);
                    }
                    sku = catalogService.saveSku(mainProduct.getDefaultSku());
                    mainProduct.getAdditionalSkus().add(sku);
                    mainProduct.setProductOptions(productOptionList);
                    mainProduct = catalogService.saveProduct(mainProduct);
                    for (Sku skuWithProductOptions : productConversion.createSkuWithProductOptions(product, mainProduct,productOptionList)) {
                        catalogService.saveSku(skuWithProductOptions);
                    }
                }
        logger.debug("End of : insertOrUpdateProductsToDB()");
        return "Product inserted into DB successfully";
       }
     catch (Exception e) {
        logger.error("Error:", e);
        return "Insertion of product into DB Failed ";
    }
  }
 //Insertion service for DB
  public String insertProductsIntoDB(){
         logger.debug("Start of : insertProductsIntoDB()");
         int insertionCount=0;
           try{
              com.admin.wrapper.getproducts.resp.Response resp = getAvailableProductsFromPBS();
              for (Product product : resp.getProducts().getProduct()) {
                    if(catalogService.findProductById(Long.parseLong(product.getId()))==null){
                        String str=insertOrUpdateProductsToDB(product);
                        if(str.equalsIgnoreCase("Product inserted into DB successfully")){
                            insertionCount=insertionCount+1;
                        }
                       }
                    }
              logger.debug(insertionCount+" Products inserted into DB successfully");
              logger.debug("End of : insertProductsIntoDB()");
              return insertionCount+" Products inserted into DB successfully";
           }catch (AdminRestException e) {
                logger.error("Error:", e);
                return e.getMessage();
           }
    }

}

My test case class and method

public class GetProductsServiceImplTest {
    private CatalogService catalogService;
    private RestClientUtil restClientUtil;
    private GetProductsServiceImpl getProductsServiceImpl;
    private org.broadleafcommerce.core.catalog.domain.Product mainProduct;
    private Sku sku;
    private  ProductOption productOption;
    private List<ProductOption> productOptionList;


    @Before
    public void setUp() throws Exception {
        catalogService = mock(CatalogService.class);
        productOptionList=mock(List.class);
        mainProduct = spy(new ProductImpl()); 
        sku =  new SkuImpl();
        getProductsServiceImpl = new GetProductsServiceImpl();
        getProductsServiceImpl.setCatalogService(catalogService);
        productOption=mock(ProductOption.class);
        restClientUtil = new RestClientUtil();


    }

    @Test
    public void testInsertOrUpdateProductsToDB() {

    restClientUtil.setSellerCode("1");
    restClientUtil.setPbsUrl("http://10.52.165.239:8080/pbs");
    getProductsServiceImpl.setRestClientUtil(restClientUtil);
    Response pbsResponse = getProductsServiceImpl
        .getAvailableProductsFromPBS();
        for (Product pbsProduct : pbsResponse.getProducts().getProduct()) {
            when(catalogService.createProduct(new ProductType("org.broadleafcommerce.core.catalog.domain.Product","Normal Product"))).thenReturn(mainProduct);
            when(catalogService.createSku()).thenReturn(sku);
            when(catalogService.saveProductOption(productOption)).thenReturn(productOption);
            when(catalogService.saveSku(sku)).thenReturn(sku);
            when(catalogService.saveProduct(mainProduct)).thenReturn(mainProduct);
            when(catalogService.saveSku(sku)).thenReturn(sku);
            getProductsServiceImpl.insertOrUpdateProductsToDB(pbsProduct);
            verify(mainProduct,times(2)).setDefaultSku(sku);
           verify(mainProduct).getAdditionalSkus().add(sku);
           verify(mainProduct).setProductOptions(productOptionList);

            break;          
        }
        }
}

This is the error while testing

   java.lang.NullPointerException
    at com.admin.api.service.getproducts.test.GetProductsServiceImplTest.testInsertOrUpdateProductsToDB(GetProductsServiceImplTest.java:68)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)

I have a few remarks that probably won't answer your orignal question. But I hope they will guide you toward a better refactor of this code. Also the code sample you showed are not enough to point you at the exact issue ; it's an NPE in the test method so it should not be that difficult to track down.

That being said here's the point I'd like to raise

  • The test code is curiously crafted, and in my opinion this code are overusing Mockito. Overall this code looks way too complex to be correctly tested anyway. I don't think it was coded following TDD principle (TDD is really convenient when it comes to testing and designing the app)

  • You may want to follow the common guideline no more than 10 line of codes in a single method, this usually helps to separate concerns and identify simpler code / intents. These simpler code could be changed and tested more easily if designed correctly (without leaking concepts or variables). For example you may want to extract a method that saves a single Product and test only that one.

  • What's even more striking is that this code seems kinda procedural (even if inside objects). And doesn't really explain the intent in business words (ok it's about saving stuff in DB, but for which reason there's all this logic, this reason should appear in the method name).

  • The test and Mockito is weird, and the code should not iterate over the collection to stub then verify

    for (Product pbsProduct : pbsResponse.getProducts().getProduct()) {
        when(catalogService.createProduct(new ProductType("org.broadleafcommerce.core.catalog.domain.Product","Normal Product"))).thenReturn(mainProduct);
        when(catalogService.createSku()).thenReturn(sku);
        when(catalogService.saveProductOption(productOption)).thenReturn(productOption);
        when(catalogService.saveSku(sku)).thenReturn(sku);
        when(catalogService.saveProduct(mainProduct)).thenReturn(mainProduct);
        when(catalogService.saveSku(sku)).thenReturn(sku);
        getProductsServiceImpl.insertOrUpdateProductsToDB(pbsProduct);
        verify(mainProduct,times(2)).setDefaultSku(sku);
        verify(mainProduct).getAdditionalSkus().add(sku);
        verify(mainProduct).setProductOptions(productOptionList);
    
        break;
    }
    
  • In pseudo code I would first try to extract the saving logic using the given/*when*/then BBDD keywords (they help to clarify what need to be tested in which scenario and context). Keep the fixture and assertions to a minimum, you would rather deal with multiple test method than multiple complex test methods.

    @Test
    public void ensure_product_is_saved_in_the_catalog() {
        // given
        Product a_simple_product = ProductBuilder.simpleProduct().build();
        when(catalogService.doSomething(....))).thenReturn(mainProduct);
    
        // when 
        productsService.saveProduct(product);
    
        // then
        verify(catalogService).doSomethingElseWith(mainProduct);
    }
    

If assertion on product data is relevant in your test scenario, then write a test that actually test the data (using JUnit assertions, AssertJ, ...). Don't mock the Product !

And proceed gradually for each test, then refactor if need ed to keep the code manageable (extract a single method in another class if necessary, etc.)

Hope that helps.

I am preparing a presentation. My topic is innovative software engineering methodologies. Agile is modern and innovative methodology but is the answer just Agile? What are the other innovative and modern methodologies? Are test driven development and behavior driven development also innovative methodologies? And eXtreme Programming is traditional methodology like waterfall?

I am not sure we can categorize these methodologies or frameworks as innovative, traditional or something else.

The choosing for a methodology or framework is completely depend on the product and customer needs. Any of them that meet the product requirements and provide efficiency to your team can be innovative in that scope.

Most of the software development process is developing complex products in complex environments in today's development World. I totally agree that agile methodologies, extreme programming, TDD and BDD are matches very well to the before definition of developing complex products in complex environment. Therefore, most of the agile methodologies are inspection of developing complex products.

Agile methodologies

The term agile is a really popular term used by software development professionals. There are lots of agile methodologies, frameworks like scrum, kanban or XP. They suggest methods to use to make us agile. The term agile is all covers these methods. Most of them solves the prediction, adaptation, transparency, inspection and empirical processes. All agile methodologies try to solve these problems faces during software development.

Extreme Programming

Extreme programming focuses on developing qualified software products and adopting to changing requirements and environment. Honestly, I really like XP. It does not suggest only development methodologies. It also suggests some about customer management, cost management etc. It is really basic, but hard to implement. I highly suggest to read the book Extreme Programming Explained by Kent Beck.

See Also :

Extreme Programming

Extreme Programming Explained, by Kent Beck

Scrum

Scrum is another framework for software development based on empirical process control: transparency, inspection, and adaptation. It is really simple and defines some role and event during software development. The roles are Scrum Master, Product Owner and Development team. The events are Sprint Planing, Daily Scrum, Sprint review and Sprint Retrospective. I suggest to read Scrum guide for further information.

See Also

Scrum Guide

Test Driven Development

Test driven development is a software development process. I can not say that it is an agile methodology itself. It helps software development to be agile. Test driven development support developers to test in the first phase. Test driven development also requires a mind set to think test before every development. It is not only writing unit test.

See also

Test-driven development

Test-driven development by Martin Fowler

Test Driven Development: By Example, Kent Beck

Behavior-driven development

It is another software development process and emerged from test driven development. It focuses on the cross team like development, management and customer have shared tools and shared processes to have the same understanding of the requirements. BDD suggest that business people, customer and technical teams should have same understanding for the product. Customer requirements, lets sat customer sentences, can be automatically tested by the tools.

See also

Behavior-driven development

10 Tips for Writing Good User Stories

Cucumber.io

Summary

The term Agile itself is missing without XP, Scrum, Kanban or any other methodology or framework. Any agile methodology or framework is also missing without TDD, BDD or Continuous integration. Any of these items must be supported by company culture, customer or business people. Every stakeholder in the project should have the mindset for product over project. Otherwise, Agile methodologies may not be helpful.

As a last word, I highly suggest to get well understanding of continuous integration.

See also

Continious Integration

Products over projects

The Clean Coder: A Code of Conduct for Professional Programmers

What are the countermeasures? ( Rearchitecting / reimplementing the system is not an option if the product is in production at many (big) customers: risk is too high. )

What is the evolutionary approach that has low risk and can be explained to management ?

As an addendum to Rob's Conklin nice answer:

I'm guessing that your system has poor (or none) set of automated test. I'd recommend the following approach (in small vertical steps, as explained below):

1) Write Smoke Tests and Integration Tests for typical cases and crucial scenarios in your system.

2) Redesign the system identifying new components, following SOLID principles.

3) Write new components using TDD, possibly reusing inside them some of existing logic, but transparently.

4) Try to implant them in your system, replacing the old components.

Important note - doing all this at once for every part of the system is usually not a good idea. Small refactoring steps are recommended. If a module is due to change then it's a good opportunity to partly or fully reimplement it, transparently to the rest of the system. Good usage of proxy, facade and wrapper/adapter patterns are usually most useful in partial refactoring of large systems. Trying to refactor/redesign/rebuild/fix whole system at once is usually a disaster :)

Another important thing, encouraged in SCRUM training materials and for example Robert Martin's Clean Coder book: you don't have to explain yourself from refactoring - it's part of the job. Task should be considered "done" if:

  • it has unit tests
  • it works
  • it was followed by local refactoring

It should be a part of normal, professional process. Bear in mind, that even your new shiny architecture will be subjected to changing requirements and constraints and can also become obsolete in time. That's why the "refactoring" part should be an ordinary part of the process.