As discussed in my previous post, Spring Integration (SI) is a routing framework built on top of the Spring Framework that allows you to use proven enterprise integration patterns to solve system integration problems via messaging. Once you’ve gotten SI configured and working to perform your routing and mediation logic, you may find that you’d like to take the next step and add more robustness to your solution. You may wish to distribute some of your routing, mediation, or service logic across multiple hosts, you may wish to add some reliability to the messages transmitted through your SI channels, and you may wish to scale out your services more than you could with a traditional client-server architecture. Well, one way to achieve some of the goals mentioned above is to use a message broker to back your SI routes. SI provides abstractions for both AMQP brokers and JMS brokers. In this post, I’d like to use the Cafe sample from the Spring Integration Samples project to illustrate how to use the popular ActiveMQ message broker to back your SI routes with JMS.

JMS is a good way to integrate your existing Java solutions with messaging. Because the JMS spec is an API, your code depends only on the JMS interfaces, regardless of which broker you’re using. You could use ActiveMQ, WebSphere MQ, or any other JMS-compliant message broker. I chose ActiveMQ for this example because of its maturity, robustness, and ubiquity in industry, and because it’s open source from the Apache Software Foundation under the Apache license. It fully implements JMS 1.1, provides high availability, and can scale horizontally through a network of brokers. If you’re integrating Java applications, stick to JMS. ActiveMQ also provides bindings for C++, C#, Ruby, Python, Erlang, and many others (see their website for the full list).

Note, AMQP is a viable alternative too. AMQP specifies a wire-level protocol that allows messaging systems built on different platforms and/or heterogeneous languages to interoperate with each other (not just Java/JVM applications, which can use the JMS API). The Cafe demo already has an implementation of AMQP for use with Spring’s RabbitMQ server (a popular open-source AMQP broker that’s part of the Spring portfolio).

For more information on the differences between AMQP and JMS, including how they work, the different terminologies used in each, and brief histories of the two, see this great PDF essay written by Mark Richards (one of the authors of the Java Message Service book, from O’Reilly).

The code associated with this post can be found at my forked version of the Spring Integration Samples project at github.com. Check out the /applications/cafe Maven module for my code.

Backing your channels with point-to-point or publish-subscribe JMS destinations

In my example, I opted to use an embedded broker. Since ActiveMQ is a pure Java solution, you can embed the broker in a Java application and use it internally as well as allow external clients to connect and participate in the messaging. Doing so does not limit your ability to configure ActiveMQ in any way. It can be easier to deploy a full integration solution with its own embedded broker than to rely on an external instance being set up and configured by someone else (another group?).

All of the spring configuration files for the ActiveMQ-based solution can be found in the META-INF/spring/integration/activemq classpath under /src/main/resources.

The files that relate to backing the SI channels with JMS destinations are cafeDemo-amq-config.xml and cafeDemo-amq-jms-backed.xml. The cafeDemo-amq-config.xml file is responsible for configuring the connection to the ActiveMQ broker. The name of the connection factory, in this case “connectionFactory”, is significant because SI will by default look for a bean of that name to configure the destinations later used by the JMS-backed channel.

The cafeDemo-amq-jms-backed.xml file looks very similar to the non-broker implementation of the cafe sample (cafeDemo-xml.xml) except that the channels have been converted to the JMS-backed versions and that the ActiveMQ broker is embedded with the rest of the configuration. Note that the method used for embedding the broker allows for complete configuration right within the Spring file. For this example, there is no dependency on an externally running broker. The configuration for this small example sets up only one transport connector (at the default port, 61616… we could have used the vm:// transport, but I wanted to show an example using TCP) and does not configure broker security, destination policies, etc. It does, however, take advantage of the out-of-the-box configuration details, including the JMX management MBeans, as well as message persistence via the recommended and highly optimized KahaDB. See the ActiveMQ documentation for more.

The channels used for the “coldDrinks” and “hotDrinks” were set up as polling channels in the original configuration. To accomplish that with JMS destinations, set the “message-driven” attribute on the channel to “false.” In this case we didn’t need to declare the destination names ahead of time, but if you’d like to add extra security and authorization properties around the destinations, you may wish to create them ahead of time either on the broker or from the SI configuration. The main class for running this sample is org.springframework.integration.samples.cafe.xml.CafeDemoActiveMQBackedChannels.

The best way to observe that ActiveMQ is indeed being used is to run the sample and use JConsole to review the MBeans in the JMX server. From JConsole, you can see that messages are being enqueued and dequeued through the queues and/or consumed from the topics. To test the robustness gained by using ActiveMQ, try running the sample and aborting it halfway through. Then comment out the line in the main file that adds orders to the system and restart the sample. It will continue processing where it left off when it was abnormally terminated. And there you have reliability and recovery just by changing a few lines of configuration for the channels.
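To make the “continue where it left off” behavior a bit more concrete, here’s a toy store-and-forward sketch using only the JDK. To be clear about the assumptions: the class name DurableQueueSketch, its send/receive methods, and the one-message-per-line journal file are all made up for illustration. This is not how ActiveMQ or KahaDB is implemented; it only shows the idea that a message stays on disk until a consumer has finished handling it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; a real broker's message store is far more sophisticated.
class DurableQueueSketch {

    private final Path journal;

    DurableQueueSketch(Path journal) {
        this.journal = journal;
    }

    // Store: the message is appended to the journal file before send() returns.
    // Assumes messages contain no line breaks (one message per line).
    void send(String message) {
        try {
            Files.write(journal, Collections.singletonList(message),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Forward: the oldest message is removed from disk only after the handler returns normally.
    // Returns false when nothing is pending.
    boolean receive(Consumer<String> handler) {
        try {
            if (!Files.exists(journal)) {
                return false;
            }
            List<String> pending = new ArrayList<>(Files.readAllLines(journal));
            if (pending.isEmpty()) {
                return false;
            }
            handler.accept(pending.get(0)); // an exception here leaves the message on disk
            pending.remove(0);
            Files.write(journal, pending);  // rewrite the journal without the handled message
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get(System.getProperty("java.io.tmpdir"),
                "cafe-journal-" + System.nanoTime() + ".log");
        DurableQueueSketch beforeCrash = new DurableQueueSketch(file);
        beforeCrash.send("order-1");
        beforeCrash.send("order-2");

        // Simulate a restart: a brand new instance picks up the same journal file.
        DurableQueueSketch afterRestart = new DurableQueueSketch(file);
        afterRestart.receive(order -> System.out.println("processing " + order));
        afterRestart.receive(order -> System.out.println("processing " + order));
    }
}
```

The design choice worth noticing is the ordering inside receive: handle first, delete second. A crash between the two means the message is seen again after a restart, which is the price of not losing it.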

What about running different parts of your routes on different servers or at least outside of the same JVM?

This allows you to add more instances of a particular part of the route to improve throughput and scalability without making any code changes (among other advantages). Just hook up more consumers to a queue/topic. Both concepts are available within the SI process (using just SI channels) as well as outside of its process (with JMS).

To demonstrate that, we’ll use the JMS inbound/outbound gateways and/or channel adapters provided by SI. With the JMS gateway, we can achieve a request-reply message exchange, while the channel adapters allow us to just fire and forget using asynchronous semantics.
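If the difference between the two exchange styles feels abstract, here’s a rough in-process sketch using only java.util.concurrent. The names (ExchangeStylesSketch, Request, send, sendAndReceive, startBarista) are mine and purely illustrative; they are not SI or JMS APIs. A BlockingQueue stands in for a JMS destination and a CompletableFuture stands in for the reply destination a gateway would wait on.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration of the two exchange styles; not the SI gateway/adapter implementation.
class ExchangeStylesSketch {

    // A request carries its payload plus a place to put the reply.
    static final class Request {
        final String payload;
        final CompletableFuture<String> reply = new CompletableFuture<>();

        Request(String payload) {
            this.payload = payload;
        }
    }

    // Fire and forget (channel-adapter style): enqueue and return immediately.
    static void send(BlockingQueue<String> destination, String payload) {
        destination.add(payload);
    }

    // Request-reply (gateway style): enqueue, then wait up to timeoutMillis for the reply.
    // Returns null if no reply arrives in time.
    static String sendAndReceive(BlockingQueue<Request> destination, String payload, long timeoutMillis) {
        Request request = new Request(payload);
        destination.add(request);
        try {
            return request.reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } catch (ExecutionException | TimeoutException e) {
            return null;
        }
    }

    // Starts a daemon thread that answers every request it takes from the queue.
    static Thread startBarista(BlockingQueue<Request> destination) {
        Thread barista = new Thread(() -> {
            try {
                while (true) {
                    Request request = destination.take();
                    request.reply.complete("prepared " + request.payload);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        barista.setDaemon(true);
        barista.start();
        return barista;
    }

    public static void main(String[] args) {
        BlockingQueue<Request> hotDrinks = new LinkedBlockingQueue<>();
        startBarista(hotDrinks);
        System.out.println(sendAndReceive(hotDrinks, "latte", 1000)); // prints prepared latte

        BlockingQueue<String> audit = new LinkedBlockingQueue<>();
        send(audit, "order placed");
        System.out.println(audit.size()); // prints 1
    }
}
```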

The example is set up the same way the AMQP sample is set up and it also relies on an externally running broker (although we could have embedded it as above). Start by running the consumers (CafeDemoAppBaristaColdActiveMQ, CafeDemoAppBaristaHotActiveMQ) that listen for cold or hot drink orders. Next, start up the flow that’s responsible for the main flow and orchestration (CafeDemoAppOperationsActiveMQ). This orchestration flow handles taking orders, splitting them, routing them to the appropriate services (the cold and hot drink Baristas from above) and then handling responses and aggregating them to be delivered by the waiter. In here you’ll see the JMS gateways set up appropriately. Finally, you’ll need to run the process that actually initiates the orders by sending them to an order queue (CafeDemoAppActiveMQ).

All four of these processes run independently of each other and could run on separate machines if necessary. They have their own application contexts and are only visible to each other through the ActiveMQ message broker. This is a highly modular and decoupled solution that uses a message broker for reliable communication. The broker, as mentioned above, can be configured for high availability so it’s not a single point of failure.

Advantages of this type of architecture:

  • message reliability – the message broker stores and forwards messages. with persistent delivery, a message is not lost if the broker goes down: it stays in the broker’s store and is delivered (or redelivered, if it was never acknowledged) once the broker and consumers are back
  • flexibility – with the components decoupled and relying on EIP, you can maintain each one independently of each other, including deployment, enhancements, etc
  • throttle or increase message processing – with components running in their own/separate processes or boxes or parts of the world, you can configure each component to consume or throttle messages depending on how much the environment can handle
  • scaling – to handle a higher throughput, just add more instances of a component to listen on a JMS destination
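The scaling point is easy to see in miniature. Here’s a small JDK-only sketch of competing consumers, where a BlockingQueue plays the role of the JMS queue and each worker thread plays the role of a barista process. The names (CompetingConsumersSketch, process, barista-N) are invented for the example. With a real broker the consumers would be separate JVMs, but the principle is the same: each message goes to exactly one of the consumers listening on the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of competing consumers on one queue.
class CompetingConsumersSketch {

    // Drains the given orders with consumerCount workers and reports which worker took each order.
    // Assumes order names are unique, since they are used as map keys.
    static Map<String, String> process(List<String> orders, int consumerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(orders);
        Map<String, String> handledBy = new ConcurrentHashMap<>();
        CountDownLatch remaining = new CountDownLatch(orders.size());
        ExecutorService pool = Executors.newFixedThreadPool(consumerCount);

        for (int i = 0; i < consumerCount; i++) {
            String name = "barista-" + i;
            pool.submit(() -> {
                String order;
                // poll() hands each order to exactly one worker and returns null once the queue is empty
                while ((order = queue.poll()) != null) {
                    handledBy.put(order, name);
                    remaining.countDown();
                }
            });
        }

        try {
            remaining.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return handledBy;
    }

    public static void main(String[] args) {
        List<String> orders = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            orders.add("order-" + i);
        }
        // Raising the second argument adds consumers without touching the producer side.
        Map<String, String> result = process(orders, 3);
        System.out.println(result.size() + " orders handled");
    }
}
```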

Disadvantages:

  • complexity – maintaining multiple components is more complicated than packing things into one process
  • debugging – along with increased complexity comes difficulty debugging. async processes are inherently difficult to debug

Take a look at the Spring Integration samples from my github repo. The application context files used to configure the ActiveMQ connectivity are all thoroughly documented.

As the Spring Integration project slowly gains more adoption and interest, developers in the enterprise integration or enterprise development space will most likely come across it. They may find it interesting, but not fully understand what it is, what problems it was created to solve, where they can get more information, and where it fits within the open-source ecosystem of ESBs and other SOA infrastructure. Here’s my attempt at a normal-person’s description of what it is.

Up first, what is it?
It’s an open-source project commissioned by SpringSource to leverage the current capabilities of the Spring Framework to focus on the problems found in the application-integration space. Without a concrete example, or a more fundamental understanding, that last sentence is probably just as vague as the other information you may have seen about Spring Integration or integration in general. So let me go into a little more detail to help make that statement a little less vague.

So why did the Spring folks decide to create a project specifically focused on integration? Doesn’t the Spring Framework itself provide a lot of that already? Spring does have fantastic abstractions for dealing with JMS, JDBC, transaction management, object-XML mapping, HTTP/RMI invocation, and many others. It also provides a dependency-injection based framework which promotes cleaner, decoupled, and easier to test code. But if you take a step back, you realize that what Spring provides is really just general-purpose building blocks, a component model, that can be used in a limitless variety of solutions. So when it comes to system/application integration, you could implement your own, very capable, solution using these building blocks. However, application integration and the problems inherent in it are not new. There are quite a few “patterns” that emerge once you’ve experienced a handful of attempts at integrating two systems for data exchange, process invocation, event notification, etc. These patterns were very well captured by Gregor Hohpe and Bobby Woolf in their timeless book “Enterprise Integration Patterns.” As I’m sure I’ve mentioned in previous blog posts, I highly recommend this book for anyone in the enterprise development space. These patterns are well known and proven for addressing most integration problems.

The Spring folks decided to take the building blocks of their Spring Framework and the patterns presented in Hohpe’s book to create a more-focused framework that specifically deals with integrating applications.

So what are the problems in the integration space? Like I said, they are described far better in EIP than I can describe them here, but here’s a simple description of a problem that is almost always present. Two applications need to share a piece of data: for example, a customer report that originates on system A needs to be available in another system B. System A can only communicate with outside applications through a direct TCP connection, and system B has a simple web service for loading report information and is unwilling to change to anything else. How do you go about doing this? You could write some custom integration “glue code” that runs periodically: set up a batch process or a cron job, break out Java’s socket or NIO libraries, connect to system A, read from and write to its streams, grab the useful data, convert it to some kind of intermediary format, map some of that data to a SOAP XML message so that system B can understand it, then break out Axis or Commons HttpClient and send the XML to system B. A lot of the coding involved in creating this integration can be classified as infrastructure and not really “custom.” For example, connecting to TCP and reading/writing streams. Why should we have to write that code? There’s nothing custom about that. Delegate that to a framework/library. What about the polling to see whether an application is available? Delegate that too; that’s not a custom concern. And the web service call? Same story. Generic components for TCP communication, polling or event handling, web service calls, routing and transforming, and many others are exactly what Spring Integration provides. And it tries to mimic the full power of the patterns described in the EIP book while using a component model familiar to previous Spring Framework users.
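As a tiny illustration of the “parse, convert to an intermediary format, map to XML” part of that story, here’s a JDK-only sketch of two transformation steps chained together. Everything specific in it is an assumption made for the example: the key=value;key=value record format for system A, the report element for system B, and the names MediationSketch, parse, toXml, and mediate. It isn’t Spring Integration code; it just shows the kind of small, composable step that a framework lets you plug in between generic endpoints.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of a two-step message transformation.
class MediationSketch {

    // Step 1: parse system A's raw record (assumed format: key=value;key=value)
    // into an intermediary map that preserves field order.
    static Map<String, String> parse(String raw) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : raw.split(";")) {
            String[] keyValue = pair.split("=", 2);
            if (keyValue.length == 2) {
                fields.put(keyValue[0].trim(), keyValue[1].trim());
            }
        }
        return fields;
    }

    // Step 2: render the intermediary map as the XML system B is assumed to accept.
    // No XML escaping is done here, so values must not contain markup characters.
    static String toXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<report>");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append('<').append(field.getKey()).append('>')
               .append(field.getValue())
               .append("</").append(field.getKey()).append('>');
        }
        return xml.append("</report>").toString();
    }

    // The two steps composed into one transformation.
    static String mediate(String raw) {
        Function<String, Map<String, String>> parser = MediationSketch::parse;
        return parser.andThen(MediationSketch::toXml).apply(raw);
    }

    public static void main(String[] args) {
        System.out.println(mediate("customer=ACME;total=42"));
        // prints <report><customer>ACME</customer><total>42</total></report>
    }
}
```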

One question that I find comes up a lot in discussions about “what is Spring Integration” is how it relates to an ESB or SOA architecture, and, if one were to do an analysis of Spring Integration vs. its competition, what specifically its competition is. First, Spring Integration is NOT an ESB. It is a *routing* and *mediation* framework. When I say it’s a mediation framework, I mean that it allows two different systems with different messages and protocols to talk with each other by “mediating” the message: resolving/negotiating the differences between the two so they can exchange data. This mediation and routing framework can be used anywhere and doesn’t need to be deployed into a heavyweight ESB container, or any ESB container. It can be deployed within an application (either a standalone application or part of a Java EE solution within an application server), within an ESB if you need, as part of a message broker, etc. It’s flexible regarding deployment. Spring Integration itself should not be compared to ServiceMix, MuleESB, TIBCO, IBM or Oracle’s ESB solutions or other ESBs. One open-source project that comes to mind as a fair comparison is Apache’s Camel project, which is also a mediation and routing engine. Apache Camel is a very powerful and highly capable solution to the integration problem space, and it also implements the patterns from the EIP book. I can do a comparison in a future blog post if readers show interest.

For more information about Spring Integration, I recommend visiting their project page, taking a look at a recent book, Pro Spring Integration, and of course reading and fully understanding the EIP book.

Over the weekend I was debugging a peculiar issue with ActiveMQ (turned out to indeed be a bug that had been reported a few months ago: https://issues.apache.org/jira/browse/AMQ-3359 ) and I became curious about how the components were being loaded up, wired together, and eventually started by the infrastructure code. If there is any interest in that, and I have time, I may blog about that later, but I stumbled upon a pretty cool way of creating your own custom namespaces for spring application context configuration files. Well, I stumbled upon how it’s done in ActiveMQ. I didn’t see much documentation online about how they did it, so I loaded up the remote debugger and figured it out.

Spring XML configuration can be quite verbose if you’re using the basic constructs of beans and properties. Solutions such as autowiring to try to keep the config files smaller just add confusion especially if you’re new to a project that uses a lot of autowiring. A better solution is to use namespaces that allow you to create your own XML elements and keep the syntax more concise and clear.

The Spring documentation has always been a valuable part of using the Spring Framework, and adding custom namespaces and hooking them into the config files is clearly explained there. See the appendix of the online documentation for the details, but basically you follow some conventions, create a schema for your new elements, and create your own namespace handler and bean definition parsers. It’s not difficult, but it’s more manual steps and parsing classes than you should have to create.

That is where xbean-spring comes into play.

The xbean-spring project (and its accompanying Ant and Maven plugins) allows you to create your own namespaces for Spring configuration files while taking care of all of the boilerplate steps for you. It allows you to map a set of POJOs, using annotations, to your custom Spring config elements, and it handles creating the XSD, hooking into the namespace handlers, and parsing the bean with a bean definition parser. In other words, all you do is annotate your existing Java beans and let xbean-spring take care of the rest. This makes creating your own custom XML configuration elements much easier, and as previously mentioned, using custom XML configuration namespaces allows your configuration to be much more concise and readable.

Add the following dependency to your pom.xml for xbean-spring:

        <dependency>
            <groupId>org.apache.xbean</groupId>
            <artifactId>xbean-spring</artifactId>
            <version>${xbean-version}</version>
        </dependency>

The first thing you need to do to enable xbean-spring is annotate your classes with the xbean-spring annotations. The project relies on comment/JavaDoc style annotations and uses the awesome QDox project to parse out these annotations (not necessary to know, but interesting nonetheless).

Here’s an example spring application context with a custom namespace and config elements:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
       http://christianposta.com/schema/core http://christianposta.com/schema/core/core.xsd">

    <simple xmlns="http://christianposta.com/schema/core" myProperty="testMe">

        <simpleController>
            <simpleController controllerName="testMeToo" />
        </simpleController>

        <controllers>
            <complexController pattern="testPattern" />
            <complexController pattern="testPattern2" />
            <complexController pattern="testPattern3" />
        </controllers>

    </simple>
</beans>

Enabling this custom namespace requires three pretty straightforward steps, none of which involve writing custom XSDs or parsers.

  1. Annotate the pojos you want to map to these custom elements
  2. Add the Maven xbean-spring plugin to your pom.xml file and configure it
  3. Create your spring config file and use the xbean-spring ApplicationContext subclasses

Here are the steps for creating the above Spring config. The source code is included with this post and is also available as a git repo at github.com.

1) Annotate the pojos

For this sample, I’ve put together three pojos:
com.christianposta.postaprojects.xbeanspring.SimpleBean, which will represent the top-level element (<simple />). The code is below, but the only thing required for the xbean-spring integration is the @org.apache.xbean.XBean annotation in the comments. And since this is the root element, there is another prop to set. See the code:


import java.util.List;

/**
 * @org.apache.xbean.XBean element="simple" rootElement="true"
 */
public class SimpleBean {

    private String myProperty;
    private SimpleController simpleController;
    private List<ComplexController> controllers;

    public String getMyProperty() {
        return myProperty;
    }

    public void setMyProperty(String myProperty) {
        this.myProperty = myProperty;
    }

    public SimpleController getSimpleController() {
        return simpleController;
    }

    public void setSimpleController(SimpleController simpleController) {
        this.simpleController = simpleController;
    }

    public List<ComplexController> getControllers() {
        return controllers;
    }

    /**
     * @org.apache.xbean.Property alias="controllers" nestedType="com.christianposta.postaprojects.xbeanspring.ComplexController"
     */
    public void setControllers(List<ComplexController> controllers) {
        this.controllers = controllers;
    }
}

There are three JavaBean properties in this class, each demonstrating a different type of configuration. The myProperty field maps to the “myProperty” attribute of the simple element: <simple myProperty="value" />. The simpleController field maps to a complex type which is a child of the <simple> element. For xbean-spring to know this, and map it properly, add the @org.apache.xbean.XBean annotation to the com.christianposta.postaprojects.xbeanspring.SimpleController class. Then xbean-spring will figure it out automatically and know that the <simpleController> element maps to that bean:

/**
 * @org.apache.xbean.XBean
 */
public class SimpleController {

    private String controllerName;

    public String getControllerName() {
        return controllerName;
    }

    public void setControllerName(String controllerName) {
        this.controllerName = controllerName;
    }
}


The last property, controllers, is a little more complex, but still very straightforward. Above the setter/mutator for that property is another xbean annotation that specifies the type to which the elements of the java.util.List should be mapped. The @org.apache.xbean.Property annotation along with the nestedType attribute lets xbean-spring figure out how to map the elements from <controllers><complexController/></controllers> to the com.christianposta.postaprojects.xbeanspring.ComplexController class. That class also needs the @org.apache.xbean.XBean annotation to participate:

/**
 * Sample ComplexController bean
 * @org.apache.xbean.XBean
 */
public class ComplexController {
    private String pattern;

    public String getPattern() {
        return pattern;
    }

    public void setPattern(String pattern) {
        this.pattern = pattern;
    }
}

After figuring out how you want the structure of your custom namespace to look, including all sub-elements and attributes, and after applying all the annotations, then you’re done with step #1. No code, no XSD, no parsers. Easy.
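If you’re curious what “mapping an attribute to a JavaBean property” boils down to, here’s a deliberately naive JDK-only sketch. It is not xbean-spring’s implementation (I haven’t reproduced any of its internals here), and the names AttributeMappingSketch, DemoBean, and bind are made up. It only handles String attributes on a single element with no namespace declarations, which is enough to show the convention: an attribute named myProperty ends up as a call to setMyProperty.

```java
import java.io.ByteArrayInputStream;
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical illustration of attribute-to-setter mapping; not how xbean-spring is written.
class AttributeMappingSketch {

    // Stand-in bean with one String property, mirroring SimpleBean's myProperty.
    public static class DemoBean {
        private String myProperty;

        public String getMyProperty() {
            return myProperty;
        }

        public void setMyProperty(String myProperty) {
            this.myProperty = myProperty;
        }
    }

    // Copies every attribute of the root element onto the matching public String setter of target.
    // Fails if an attribute has no matching setter (xmlns declarations included).
    static <T> T bind(String xml, T target) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            NamedNodeMap attributes = root.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                Node attribute = attributes.item(i);
                String name = attribute.getNodeName();
                // JavaBean convention: attribute "myProperty" maps to setter "setMyProperty"
                String setterName = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
                Method setter = target.getClass().getMethod(setterName, String.class);
                setter.invoke(target, attribute.getNodeValue());
            }
            return target;
        } catch (Exception e) {
            throw new IllegalStateException("Could not bind " + xml, e);
        }
    }

    public static void main(String[] args) {
        DemoBean bean = bind("<simple myProperty=\"testMe\"/>", new DemoBean());
        System.out.println(bean.getMyProperty()); // prints testMe
    }
}
```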

2) Add the xbean-spring maven plugin to pom.xml

Upon running maven to compile your code, you need the xbean-spring plugin to do the magic of binding the xbean annotations to all of the spring boilerplate code (building the XSD, the spring.handlers, and the spring.schemas files). Add the following plugin declaration to your pom.xml:

            <plugin>
                <groupId>org.apache.xbean</groupId>
                <artifactId>maven-xbean-plugin</artifactId>
                <version>${xbean-version}</version>
                <executions>
                    <execution>
                        <phase>process-classes</phase>
                        <configuration>
                            <namespace>http://christianposta.com/schema/core</namespace>
                            <schema>${basedir}/target/classes/core.xsd</schema>
                            <outputDir>${basedir}/target/classes</outputDir>
                            <generateSpringSchemasFile>true</generateSpringSchemasFile>
                            <strictXsdOrder>false</strictXsdOrder>
                        </configuration>
                        <goals>
                            <goal>mapping</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>


An explanation of the config options for the plugin:
namespace: this is the default namespace that’s used for the XSD and the elements from it. It should match what you declare in your spring config file for your new namespace

schema: this is the location and name to put the generated schema

outputDir: this is where the xbean-spring and spring boilerplate properties files will go (under META-INF mostly, but some others under the /classes root)

generateSpringSchemasFile: whether to generate spring.schemas file. optional if you want to supply your own

strictXsdOrder: available from xbean-spring 3.9 and above, allows the namespace elements to be in any order or strict order.

After running any Maven lifecycle phase that includes “process-classes” (e.g., package, install), you’ll see in the target/classes/ folder all of the artifacts produced by xbean-spring that hook into the Spring namespace system for building custom namespaces. You’ll see that the XSD is generated and located wherever you specified in the schema property of the plugin. You’ll also notice that target/classes/META-INF contains the spring.handlers and spring.schemas properties files that are required by Spring. These were generated by the plugin.

3) Create your applicationContext file and load up the Spring Context

See the applicationContext from above. You also need to use the xbean-spring aware ApplicationContext classes. There are four of them, and they map to the ApplicationContext classes found within spring, but with added support for xbean-spring:


org.apache.xbean.spring.context.ClassPathXmlApplicationContext
org.apache.xbean.spring.context.FileSystemXmlApplicationContext
org.apache.xbean.spring.context.ResourceXmlApplicationContext
org.apache.xbean.spring.context.XmlWebApplicationContext

Like I said, they’re exactly the same as the ones that come with spring, but with added support for xbean. Note, the only one that doesn’t have a counterpart in the core spring is ResourceXmlApplicationContext.

See this unit test from the accompanying source code:

    @Test
    public void testBeanGetsCreated() {

        // Got to use the XBean version of the Application Context
        ApplicationContext context = new ClassPathXmlApplicationContext("/applicationContext-test.xml");
        assertEquals(1, context.getBeansOfType(SimpleBean.class).size());

        SimpleBean bean = context.getBean(SimpleBean.class);
        assertEquals("testMe", bean.getMyProperty());

        SimpleController controller = bean.getSimpleController();
        assertEquals("testMeToo", controller.getControllerName());

        List<ComplexController> controllers = bean.getControllers();
        assertEquals(3, controllers.size());

    }

That’s it! Feel free to look at the source code for the xbean-spring project: http://svn.apache.org/repos/asf/geronimo/xbean/trunk/xbean-spring/

Please feel free to comment if you know of more options and more advanced xbean mapping. I realize there are more complicated mappings, so please advise!

Although the ActiveMQ website already gives a pithy, to-the-point explanation of ActiveMQ, I would like to add some more context to their definition.

From the ActiveMQ project’s website:

“ActiveMQ is an open sourced implementation of JMS 1.1 as part of the J2EE 1.4 specification.”

Here’s my take: ActiveMQ is an open-source, messaging software which can serve as the backbone for an architecture of distributed applications built upon messaging. The creators of ActiveMQ were driven to create this open-source project for two main reasons:

  1. The available existing solutions at the time were proprietary/very expensive
  2. Developers with the Apache Software Foundation were working on a fully J2EE compliant application server (Geronimo) and they needed a JMS solution that had a license compatible with Apache’s licensing.

Since its inception, ActiveMQ has turned into a strong competitor of the commercial alternatives, such as WebSphereMQ, EMS/TIBCO and SonicMQ and is deployed in production in some of the top companies in industries ranging from financial services to retail.

Using messaging as an integration or communication style leads to many benefits such as:

  1. Allowing applications built with different languages and on different operating systems to integrate with each other
  2. Location transparency – client applications don’t need to know where the service applications are located
  3. Reliable communication – the producers/consumers of messages don’t have to be available at the same time, or certain segments along the route of the message can go down and come back up without impacting the message getting to the service/consumer
  4. Scaling – can scale horizontally by adding more services that can handle the messages if too many messages are arriving
  5. Asynchronous communication – a client can fire a message and continue other processing instead of blocking until the service has sent a response; it can handle the response message only when the message is ready
  6. Reduced coupling – the assumptions made by the clients and services are greatly reduced as a result of the previous 5 benefits. A service can change details about itself, including its location, protocol, and availability, without affecting or disrupting the client.

Please see Gregor Hohpe’s description about messaging or the book he and Bobby Woolf wrote about messaging-based enterprise application integration.

There are other advantages as well (hopefully someone can add other benefits or drawbacks in the comments), and ActiveMQ is a free, open-source software that can facilitate delivering those advantages and has proven to be highly reliable and scalable in production environments.

I recently stumbled upon an essay that I’ve read many times in the past. I didn’t have it bookmarked on this newer laptop, so after bookmarking it I also decided to share it with you. It was published originally back in 1992 in the magazine C++ Journal. The author, Jack Reeves, puts forth an assumption that is believed to be true and then discusses the impact of that assumption on the way we look at software development and design. His assumption: The source code of the software is the actual design; just like traditional engineers produce design documents, the artifact of designing software is the source code. Traditional engineers then give their design documents to construction/manufacturing who then create a product. The analog to that in software would be handing the source code over to the compilers, linkers, and build system to produce a product.

The consequences he discusses as a result of this assumption really help to illuminate why there are pain points in the software development process in organizations that practice the traditional waterfall process to software engineering.

Here is the link to the article:

What is Software Design?

Enterprise architects seem to become more and more involved in “trying out new things” or pushing down technology or implementation advice (nay, dictation) without having a dog in the fight, or having to code any part of it. I’ve observed this in quite a few places, both working with the architects as a fellow architect and as a developer. From these observations, I’ve come up with three rules for myself for being a good enterprise architect that I believe may be worth sharing and discussing.

#1 Gain the respect of the developers

I would like to generalize and say that developers seem to be the type of people who don’t want to put up with more bullshit than they absolutely have to. So trying the typical political maneuvering that you find in big companies to impress developers won’t work. That includes salesmanship, PowerPoint presentations, etc. Those skills can be important for relaying a direction or vision, but they’re not going to impress the developers. The most tried and true way to gain their respect is to code with them. Yes, indeed. Good architects code. Bad ones pontificate. And there seem to be *way* more of the latter than the former. Coding your brilliantly “architected” solution will help gain their respect. But it also helps in another area. The second rule I follow.

#2 Realize that you cannot design a system on paper.

The source code is not the product that you’re engineering. The source code itself is the design. So when I sit in an architecture role, I remind myself that coming up with diagrams and flow visualizations is not the design. It’s a brainstorm to help develop a model in my head. But without putting that model to code, you don’t know how it will truly behave, or how the architected solution should be altered. And believe me. In almost all cases, it should be altered. In other words, there should be a feedback loop between the developers and the architects. And if you follow rule #1, you’ll be right there to observe first hand how your solution plays out in code.

#3 Don’t résumé-build

Don’t glom onto the latest and most shiny technology and push it onto the developers without putting it through some rigorous, real-life situations. Playing with new technology is fun. I do it all the time. But I do it outside of my day job. Sacrificing the stability of the team, the software, and the business model just because some technology seems cool and Google might hire you if you know it is not a respectable way to go about solving enterprise problems. Even if you’ve seen enough sales presentations about how this new technology is going to be such a magic bullet, resist the temptation to try to indoctrinate the rest of the team until you’ve put the new technology to real life software problems in an incubator.

I’ve been on both sides of the fence, have worked with a bunch of good developers and architects, and these are my three rules. Anyone want to add anything?

I stumbled upon a discussion, probably an age-old one, about whether Java passes function arguments by reference or by value. I know I’ve studied this in the past, and I know the answer is not crystal clear unless you adopt C++’s accepted definition of pass by reference. Under that definition, Java is always pass by value.

Java does not allow passing arguments by reference, wherein the reference is an alias for the actual object/variable. Java passes pointers by value (copies the pointer) to a function which can then dereference the pointer and manipulate the object. However, just because the pointer allows access to the object and allows mutations of the object doesn’t mean you have a reference to the object.

For example, x = 0 would look something like this in memory:

----------------------------
0x00000001   |      0       |
----------------------------

Here x names the location in memory that holds the value 0. The value of x is 0, and the place where x is stored is 0x00000001. A reference to x is 0x00000001. x is not a pointer; there are no pointers in this example.

A pointer would be something like this:

x = 0
p = &x

----------------------------
0x00000001   |      0       |  <-- this is x
----------------------------
0x00000002   |   0x00000001 |  <-- this is p
----------------------------

In Java, when you make a function call, the stack frame that's set up for the call contains copies of the arguments: the value itself for a primitive, or the pointer for an object. For example:

int a = 21;
Object b = new Object();

Let's say these are represented in memory like this:

----------------------------
0x00000001   |      21      |  <-- this is a
----------------------------
0x00000002   |  0x0000fff1  |  <-- this is b
----------------------------
			 .
			 .
			 .
-------------------------------------------
0x0000fff1   |  begin details of object b  |  <-- this is the heap somewhere with object's contents
-------------------------------------------

A method call like foo(a, b) will result in copying the values of 21 and 0x0000fff1 onto the call stack:

void foo(int x, Object y)

----------------------------
0x00000032   |      0       |  <-- current stack pointer
----------------------------
0x00000033   |      0       |  <-- return value (for illustration only)
----------------------------
0x00000034   |      21      |  <-- this is x
----------------------------
0x00000035   |  0x0000fff1  |  <-- this is y
----------------------------

As you can see, the values of a and b are copied onto the stack frame as x and y. x has the value of 21, and y holds a pointer to the same object pointed to by b. You can manipulate the object through y (which changes the contents of the object that b also points to), but assigning y to some new object will not reassign b to a new object. This is because you would only be changing the value of y to be a different pointer. You wouldn't be changing the value of b (which still points to the original object). This is the distinction between pass by value and pass by reference: y is not an actual alias of b.

If it were, I could completely reassign y to be something else, and when the code returned to where it was called, b would have the same value as y. But it won't. You cannot change the *value* of b (what b points to) by changing y.
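A small Java program makes the same point. This is only a sketch: StringBuilder stands in for the Object in the diagrams (so that a mutation is visible when printed), and the class and method names are made up for illustration.

```java
public class PassByValueDemo {

    // Reassigning the parameters changes only the callee's copies
    // (x and y on the callee's stack frame), never the caller's a and b.
    static void reassign(int x, StringBuilder y) {
        x = 99;
        y = new StringBuilder("replaced");
    }

    // Following the copied pointer reaches the same heap object the
    // caller's variable points to, so this change is visible to the caller.
    static void mutate(StringBuilder y) {
        y.append("-mutated");
    }

    public static void main(String[] args) {
        int a = 21;
        StringBuilder b = new StringBuilder("original");

        reassign(a, b);
        System.out.println(a + " " + b);  // prints: 21 original

        mutate(b);
        System.out.println(a + " " + b);  // prints: 21 original-mutated
    }
}
```

If Java passed arguments by reference, the first line printed would be "99 replaced".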

Recently, I needed a way to report the version of the application I’m working on to the users of the application.

I wanted to avoid manually updating a settings file or some similar configuration file that the application could read at runtime to report the version. That would be a manual step that would eventually be forgotten or overlooked.

I tag each of my deployments in SVN, so I figured there must be some way to associate the tagname (which represented the version) to the version that’s displayed to the users. When you run the ‘svn info’ command, it does display the working directory’s URL path to the svn repo. I decided to ask around for suggestions, as I didn’t necessarily want to roll new code to parse that URL path if some useful utility existed.

Luckily, someone did recommend a library that provided an svn binding for python. I explored pysvn and found it provided exactly what I needed. The documentation for pysvn is outstanding. It gave all the examples and descriptions that I needed to easily write a function to retrieve the tag name from the working directory’s svn URL.

My first step was to install the pysvn bindings. Unfortunately, I didn’t see it available in PyPI, so I had to download and install it (thankfully on Ubuntu, it’s a simple sudo apt-get install python-svn call, as described here).

I used the pysvn libraries to retrieve the working directory’s svn URL, the builtin urlparse library to parse the path from the full url, and finally the posixpath library to get the ‘basename’ which is the tagname.

Here’s my final code for doing so:

import pysvn
import urlparse, posixpath

URLPARSE_PATH_INDEX = 2

def find_basepath_name():
    # initialize the pysvn Client object, which contains all the functions for
    # working with the svn repo, including checkout, add, and status
    client = pysvn.Client()

    # grab the info from the current working directory
    entry = client.info('.')

    # parse the results of the URL to which the working directory is associated
    url_details = urlparse.urlparse(entry.url)

    # grab the 'path' component of the parsed url
    path = url_details[URLPARSE_PATH_INDEX]

    # use the posixpath module to correctly parse out the basename
    basename = posixpath.basename(path)
    return basename

if __name__ == '__main__':
    basename = find_basepath_name()
    print basename

I was listening to a podcast today of Scott Hanselman interviewing Michael Feathers, the author of Working Effectively with Legacy Code. I’m currently reading the book, and I highly recommend it. Feathers said something that resonated with me very strongly because it’s something I observe quite frequently at the different customers I’ve worked for and in the different software-development groups of which I’ve been a part. He said:

“I think it’s one of the worst mistakes you can make in software development, to feel that design is over. It’s really a continual process. Even in the older systems, you should be creating new functions and creating new classes.”

“Design is a continual process of examination and re-examination and evaluation.”

I have observed that most development in industry is maintenance development. Although it would be nice, we don’t always get the chance to be part of a greenfield project, one in which we can start a design from scratch. But the implicit undertone of “not being able to start a design from scratch” is that new design takes place only on new projects! This is NOT the case! New design can emerge anytime you touch a code base, whether it’s legacy code or new code.

How many times have you been tasked to change the functionality of a module and you proceed by going in and adding logic to an existing function or adding methods to already-existing classes? On the other hand, how many times have you gone in and created new classes? Or broken out functionality from existing methods to new, smaller, methods? In other words, how many times have you added new abstractions to convey the thoughts behind the new functionality?

That’s what Feathers refers to as design not being over, and I agree. You should constantly be looking for abstractions in your code, and altering them if they don’t completely agree with the functionality. If you’re looking for new abstractions, or rethinking and reevaluating the old abstractions, you’re still taking part in the design of the system.

Doing the opposite usually is detrimental to the code.

Altering functionality to existing functions by adding more code without thinking about how that problem can be broken down and abstracted properly further bloats the code and makes it even harder to abstract out the important parts in the future. Furthermore, it encourages the next developer to just do the same thing you did, i.e., continue to add more code to the existing method making it more complex, more bloated, and more unmaintainable. This bloated code works to compromise the design.

Bloated code attracts and desires MORE bloated code.

You should constantly be on the lookout for ways you can improve the design. Anytime you touch the code base, your changes can work to improve or work to destroy the design of a system.

Recently, a reader spotted a comment that I made on a different blog concerning generalizing data-access operations for inappropriate reasons. What follows was his email and my response:

Christian,

You made the following comment on this blog:

I’ve seen many examples similar to this that try too hard to fight an separation of concerns issue by ‘generalizing’ their data-access operations way too much.

Could you tell me why this approach is trying too hard?

Thanks,

Bob

Bob,

Here’s what I meant by that statement.

Data Access Objects (DAO) are used for primitive access operations against an abstracted data store (database, web services, RMI, etc.). Those primitive operations include insert, update, delete, and query. I think the DAO should be limited to those operations, and only those operations, with an emphasis on very simple queries for the ‘query’ operation (e.g., “findByID”). The purpose of this reasoning is to keep the functionality on the DAO very focused on its limited responsibility. Adding methods such as “findByStreetAddressAndZip” or “queryForByAccountBalance” muddies the responsibility of a DAO.

However, those types of methods will most likely be necessary. But the context in which they are necessary helps illuminate the best place to put them. For example, in an architecture where the domain layer is nicely separated from the rest of the supporting software, and all domain logic resides there, those domain objects will need to retrieve data from a data store. Enter the repository pattern (http://thinkddd.com/glossary/repository/). The repository pattern would provide the glue between the data store and your domain objects. There you will find methods such as findByStreetAddress or findLastTransaction, but they will be completely related to the business logic and operations. These methods in the repository will provide a very explicit “seam” or contract between your business logic and the data store.
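To make the split concrete, here is a minimal Java sketch of what I mean. Everything in it is hypothetical and only for illustration: the Customer class, the interface names, and the in-memory map that stands in for a real data store.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DaoVsRepositorySketch {

    /** Hypothetical domain object. */
    static final class Customer {
        final long id;
        final String street;
        final String zip;

        Customer(long id, String street, String zip) {
            this.id = id;
            this.street = street;
            this.zip = zip;
        }
    }

    /** DAO: primitive operations plus one very simple query, nothing more. */
    interface CustomerDao {
        void insert(Customer c);
        void update(Customer c);
        void delete(long id);
        Customer findById(long id);
    }

    /** Repository: finders named for what the business logic needs. */
    interface CustomerRepository {
        List<Customer> findByStreetAddress(String street, String zip);
    }

    /** In-memory DAO; the map is an assumed stand-in for a real data store. */
    static final class InMemoryCustomerDao implements CustomerDao {
        private final Map<Long, Customer> store;

        InMemoryCustomerDao(Map<Long, Customer> store) { this.store = store; }

        @Override public void insert(Customer c) { store.put(c.id, c); }
        @Override public void update(Customer c) { store.put(c.id, c); }
        @Override public void delete(long id) { store.remove(id); }
        @Override public Customer findById(long id) { return store.get(id); }
    }

    /** In-memory repository reading the same stand-in store. */
    static final class InMemoryCustomerRepository implements CustomerRepository {
        private final Map<Long, Customer> store;

        InMemoryCustomerRepository(Map<Long, Customer> store) { this.store = store; }

        @Override
        public List<Customer> findByStreetAddress(String street, String zip) {
            List<Customer> matches = new ArrayList<>();
            for (Customer c : store.values()) {
                if (c.street.equals(street) && c.zip.equals(zip)) {
                    matches.add(c);
                }
            }
            return matches;
        }
    }

    public static void main(String[] args) {
        Map<Long, Customer> store = new LinkedHashMap<>();
        CustomerDao dao = new InMemoryCustomerDao(store);
        CustomerRepository repo = new InMemoryCustomerRepository(store);

        dao.insert(new Customer(1, "12 Main St", "85001"));
        dao.insert(new Customer(2, "40 Oak Ave", "85002"));

        System.out.println(repo.findByStreetAddress("12 Main St", "85001").size()); // prints: 1
        System.out.println(dao.findById(2).street); // prints: 40 Oak Ave
    }
}
```

The DAO interface never grows beyond the primitives, while the repository is free to gain finders as long as each one is named for a business operation.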

A lot of times this very subtle distinction gets lost, and “a seam between business logic and data store” becomes muddied to mean “a seam between your software and data store”. This manifests itself when the repositories act as the seam for the business logic as well as the seam for the user interface. This is what leads to the explosion of “findBy…”, “findWhenThisIsTrue”, etc. methods. The user interface is constantly querying for data to display to the user, but is a repository the best place to put those methods? A repository is meant to act as a seam between the business logic and the data store, not between the entire software (GUI included) and the data store.

When the UI is also being fed by the repository objects, developers then cook up reasons for wanting to generalize the repository methods into findBy(Query). What this does is dilute any sort of seam or contract you had between the business logic and the data store, and open up the repository/data access to mean anything.

The fact is, retrieving data from a data store for the purpose of painting the UI is a DIFFERENT CONCERN than retrieving data from the data store to support the business logic. Coming to this realization can help simplify a design by putting the appropriate logic in appropriate parts of the architecture and keeping their responsibilities focused.

A simple example would look like this:
The domain module would contain repositories that implemented very specific methods for retrieving data from a data store in support of the domain operations. Generalized queries to support any UI functions would be disallowed. The domain layer would know nothing about the UI and what data it displays.

A UI module would have separate data-access classes specifically to support displaying data in the UI, even if this approach *appears* to introduce some “duplication.” You may find folks take up crusades against this ‘duplication’ because they don’t understand the separation of concerns in this architecture. The *contexts* in which these objects are being used are completely different, therefore the objects should not be considered to be the same. The objects and data-access classes that support the UI will be able to change independently of the domain layer. They can be modified endlessly without any worry of breaking the domain. Their data can come from the same database/data store from which the domain layer gets its data, or from a completely separate database/data store in much larger applications (reporting or read-only DBs). This will allow your app to scale substantially better than if the UI and business logic operations/objects were all sloppily mangled together.
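Here is a rough Java sketch of that layout, to show the shape rather than a real implementation. All of the names are hypothetical (StoredRow, Account, AccountListItem, and so on), and a plain in-memory list stands in for the database table that both modules read.

```java
import java.util.ArrayList;
import java.util.List;

public class SeparationSketch {

    /** One row in the stand-in data store (hypothetical schema). */
    static final class StoredRow {
        final long id;
        final String owner;
        final long balanceCents;

        StoredRow(long id, String owner, long balanceCents) {
            this.id = id;
            this.owner = owner;
            this.balanceCents = balanceCents;
        }
    }

    // ----- domain module: knows nothing about the UI -----

    static final class Account {
        final long id;
        final long balanceCents;

        Account(long id, long balanceCents) {
            this.id = id;
            this.balanceCents = balanceCents;
        }

        boolean isOverdrawn() { return balanceCents < 0; }
    }

    /** Seam between business logic and data store; no generic findBy(Query). */
    static final class AccountRepository {
        private final List<StoredRow> table;

        AccountRepository(List<StoredRow> table) { this.table = table; }

        List<Account> findOverdrawn() {
            List<Account> result = new ArrayList<>();
            for (StoredRow row : table) {
                Account account = new Account(row.id, row.balanceCents);
                if (account.isOverdrawn()) {
                    result.add(account);
                }
            }
            return result;
        }
    }

    // ----- UI module: its own objects, its own data access -----

    static final class AccountListItem {
        final String label;

        AccountListItem(String label) { this.label = label; }
    }

    /** Exists only to feed a screen; can change without touching the domain. */
    static final class AccountListQueries {
        private final List<StoredRow> table;

        AccountListQueries(List<StoredRow> table) { this.table = table; }

        List<AccountListItem> itemsForOwner(String owner) {
            List<AccountListItem> items = new ArrayList<>();
            for (StoredRow row : table) {
                if (row.owner.equals(owner)) {
                    items.add(new AccountListItem(row.owner + " #" + row.id));
                }
            }
            return items;
        }
    }

    public static void main(String[] args) {
        List<StoredRow> table = new ArrayList<>();
        table.add(new StoredRow(1, "alice", 5000));
        table.add(new StoredRow(2, "bob", -1250));

        System.out.println(new AccountRepository(table).findOverdrawn().size());             // prints: 1
        System.out.println(new AccountListQueries(table).itemsForOwner("alice").get(0).label); // prints: alice #1
    }
}
```

Both classes loop over the same rows, which is the apparent “duplication”, but AccountListQueries could be pointed at a read-only reporting store, or reshaped for a new screen, without the domain's Account or AccountRepository changing at all.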

I realize parts of my explanation may need further examples or commentary, so please let me know where to expound if necessary.