Friday, February 29, 2008

GWT & object serialization

When writing GWT business applications you have the problem that the object graph residing at the server has to be partially sent to the browser/client, (perhaps) updated, and sent back.

A tough problem to crack is how to break the graph into pieces. After all, we do not want to send the entire database from the server to the client. This is similar to the problem that Hibernate solves when communicating with the database. Hibernate loads needed objects from the database on demand. Hibernate also optimizes the updates to the database.

I think it is key to have something similar to Hibernate for the GWT arena. Instead of manually loading additional objects, a framework should handle this.

A key difference is that (in most cases) the Hibernate instance is a singleton, thus allowing caching techniques. Caching at the browser is tricky since other clients may have updated the data.

My hope was that a framework like XSTM would solve this problem. The sad thing is that it uses code generation. I'm not in favor of code generation, especially not when it involves the domain objects. The reason is that it makes the usage of the library intrusive throughout your entire application. The advantage of code generation over bytecode manipulation in this case is, of course, that it works in GWT.

Anyway, when I look at the sample, the replicated objects are stored in a share, which is a generic container and thus requires a lot of casting. My idea that you could retrieve objects with some kind of query language turns out not to be the case: you have to manually put objects in the share and get them out. So this solution will work well for games, I suspect, but not for business applications with huge databases. In that case you have to write code on the client side to notify the server which objects to put in the share. Again, plumbing that I hoped the framework would fix. It looks like XSTM requires an additional layer of abstraction on top of it that hides the clouds/shares and lets you just work with regular objects.
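A minimal sketch of what such a framework could offer on the client: a lazy reference that fetches an object from the server only on first use and caches it afterwards, Hibernate-style. All names here (LazyRef, ObjectLoader) are made up, and the callback interface is a simplified stand-in for GWT's asynchronous style, not any real API.

```java
import java.util.HashMap;
import java.util.Map;

public class LazyRefDemo {

    // Simplified stand-in for GWT's asynchronous callback style.
    interface AsyncCallback<T> {
        void onSuccess(T result);
    }

    // Hypothetical loader the framework would use to fetch from the server.
    interface ObjectLoader {
        void load(String id, AsyncCallback<Object> callback);
    }

    // A lazy reference: the object is fetched only on first use,
    // then cached on the client.
    static class LazyRef<T> {
        private final String id;
        private final ObjectLoader loader;
        private T value;
        private boolean loaded;

        LazyRef(String id, ObjectLoader loader) {
            this.id = id;
            this.loader = loader;
        }

        @SuppressWarnings("unchecked")
        void get(final AsyncCallback<T> callback) {
            if (loaded) {
                callback.onSuccess(value);
                return;
            }
            loader.load(id, new AsyncCallback<Object>() {
                public void onSuccess(Object result) {
                    value = (T) result;
                    loaded = true;
                    callback.onSuccess(value);
                }
            });
        }
    }

    // Demo: two lookups, but only one "server" round trip.
    // Returns the number of server calls so the behavior is verifiable.
    static int runDemo() {
        final Map<String, Object> db = new HashMap<String, Object>();
        db.put("order-1", "Order #1");
        final int[] serverCalls = {0};
        ObjectLoader loader = new ObjectLoader() {
            public void load(String id, AsyncCallback<Object> cb) {
                serverCalls[0]++;
                cb.onSuccess(db.get(id));
            }
        };
        LazyRef<String> ref = new LazyRef<String>("order-1", loader);
        AsyncCallback<String> sink = new AsyncCallback<String>() {
            public void onSuccess(String s) { /* use the order here */ }
        };
        ref.get(sink); // triggers a server call
        ref.get(sink); // served from the client-side cache
        return serverCalls[0];
    }

    public static void main(String[] args) {
        System.out.println("server calls: " + runDemo());
    }
}
```

The caching caveat from above still applies: once cached, the client has no idea whether another client has updated the object on the server.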

Saturday, February 16, 2008

Cross-region software development

At my company we have been struggling to find the optimal partitioning of the source tree of a product that is developed in three different countries (US, India and NL).

Some people prefer to have a single source tree. Others (myself among them) prefer a stricter separation between the teams in different locations and time zones. One of the major reasons is that if the build is broken or dysfunctional because of team A, it impacts the productivity of teams B and C. The developers of team B or C come in in the morning and face a build into which they cannot check in. First the failure has to be solved before regular development can continue. Admittedly, this can be done by a single person, but the broken build still impacts all the developers of a team, since they can't check in and might have updated from source control before realizing something was wrong. Since a team performs optimally when it is running at constant speed - similar to a car's engine - you need to avoid peaks and breaks.

Another reason for keeping sources separate is that you want to avoid mixing integration and regular development across teams. Otherwise you cannot easily isolate external factors. It is much more efficient when you have completed your team's development work and all tests are passing, and only then integrate new releases of other teams or third-party artifacts. If something fails, the cause is much easier to spot than when you have also changed your own code.

Lesson: separate daily development from integration. We effectuated this (or tried to) by giving each team and the application its own Maven artifact. Admittedly, you need Maven mavens to get the whole thing working. (BTW, we never got to the point where we could fully demonstrate this Maven approach to be more efficient; in the past we used branches.)
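A sketch of what this looks like in the application's POM; all groupIds, artifactIds, and version numbers are made up. The point is that the application depends on released (non-SNAPSHOT) team artifacts, so picking up another team's work is an explicit, isolated version bump rather than an implicit daily merge:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example.product</groupId>
  <artifactId>application</artifactId>
  <version>1.2.0-SNAPSHOT</version>
  <dependencies>
    <!-- Each team releases its own artifact at the end of an iteration. -->
    <dependency>
      <groupId>com.example.product</groupId>
      <artifactId>team-us-core</artifactId>
      <version>1.1.3</version>   <!-- released, not SNAPSHOT -->
    </dependency>
    <dependency>
      <groupId>com.example.product</groupId>
      <artifactId>team-india-services</artifactId>
      <version>1.1.7</version>
    </dependency>
    <dependency>
      <groupId>com.example.product</groupId>
      <artifactId>team-nl-ui</artifactId>
      <version>1.1.2</version>
    </dependency>
  </dependencies>
</project>
```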

An often-heard counterargument for having separate source trees is the price you have to pay for refactorings, especially renaming classes / methods. This can be solved in the following way: at the end of an iteration the teams have released their artifacts and hopefully the product is working with all the parts integrated. So the start of the next iteration is an excellent time to get all the source code into a single view / project and refactor some elements. Yes, indeed you do have to wait. That is a small price to pay. Note, however, that for really big products, getting all the source code into a single project probably makes the IDE too slow - not something you want to work with the whole day. Besides renaming there are also refactorings like changing a method signature. The same can be accomplished by deprecating the old method and introducing a new one.
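The deprecate-and-introduce approach could look like this minimal sketch (class and method names are hypothetical): the old signature stays around for an iteration and delegates to the new one, so other teams can migrate at their own pace instead of being broken overnight.

```java
public class PriceService {

    /** @deprecated Use {@link #price(String, String)} with an explicit currency. */
    @Deprecated
    public double price(String productId) {
        // Old callers keep working for one more iteration.
        return price(productId, "EUR");
    }

    /** New signature: the currency is now explicit. */
    public double price(String productId, String currency) {
        // Made-up computation, for illustration only.
        return "EUR".equals(currency) ? 10.0 : 12.5;
    }
}
```

Once all teams have moved to the new signature, the deprecated method is deleted in a later iteration.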

Another mistake often made is the focus on compilation - which can easily be accomplished with a single source tree. If the code compiles together, then we are OK. But compiling is only a small part of creating the product; much more important is to have a stable product and to know what is in it. That the code compiles together does not mean a thing. It gets much better if the artifacts have passed their unit tests, are released, and are composed into a new product - which should then pass some automated acceptance test.

There is much more to this discussion, I will have to add that another time.

Panta rhei

Everything flows - Heraclitus.

Well-known fact: while you write down the specs, the world is changing. By the time you've finished the implementation, the world is no longer the same. This is why we all went agile.

There is another side to this: with an agile approach you take shortcuts in the first iteration. Do the simplest thing that could possibly work, and so on. However, a couple of iterations down the road, you have forgotten you took some shortcuts. People start cursing the wacky implementation, forgetting that this was a deliberate choice and that now the time has come to crank up the implementation.

Iterations also pose a challenge for QA. In my experience they are still used to the waterfall process, and it is hard for them: each iteration new functionality is delivered, but some things are still not (yet) fully functional. The deliverable of each iteration has to be carefully defined to enable them to properly test it. However, how much time are you going to invest in writing a detailed document that is obsolete after the next iteration? Worse, how are you going to get the full spec after all iterations are finished? How efficient are UI designers if they know that the UI will change each iteration? Do they start with a wizard or tab-based approach because they know the next iteration requires more widgets?

There is some kind of impedance mismatch between the development process (iterative) and the product itself. As usual, a trade-off.

Tuesday, February 12, 2008

Scala

I've been reading the “Scala for Java Refugees” series. I'm amazed by the power of this language. Tuples, closures, etc. - everything that annoys me about Java is 'fixed'. The language itself is a bit overwhelming; it reminds me of university with its magical and impressive calculus notations.

Reading about implicit type conversions in part 6, I got a feeling similar to when I read about AOP for the first time. It might become complex to understand the code, especially in the case of Scala, where a smart compiler can find paths the developer has not thought of.

Probably the answer is the same as with AOP: use it sparingly, wisely, and effectively. In other words, not all over the place, but where it pays off, e.g. security, transactions, all those clear cross-cutting concerns.

I'm impressed by the expressive power of the language. In ScalaTest the expect construct is close to natural language (OK, a bit exaggerated). The freedom it gives to clearly state the intent is something you never get with Java.

Thursday, February 7, 2008

JODReports and MS-Word

I finally implemented my document generation application using JODReports. The basic idea is as follows: the user creates a bunch of templates using the JODReports variable mechanism and stores them in a certain folder. My application picks the templates up, converts them to OpenOffice format, merges them with the data from the database, converts them back to Word, and writes the files to a certain folder.

Works pretty well so far. A couple of issues I stumbled into:
  • Editing the Word document by pasting a variable name from HTML (in my case the instructions page in the browser) gives weird Freemarker parse exceptions. Pasting as plain text works around the problem.
  • In the supplied merge data, the zip code field is separated from the city name by two spaces. However, the final merged document contains one space. This is also the case if the data contains three spaces...
  • The default error handling of Freemarker prints to the console. There is documentation on how to fix this, but it requires more research.
  • The JODReports code is not well maintained. The code of the converter and the report parts is out of sync (watch the package names). Luckily this does not pose a problem when using it.
It is a bit of a pity that JODReports does not support callbacks. Now you have to pass in all data up front. It would be nicer if JODReports would do a callback asking for the value of a field. The reason is that I need to supply a couple of calculated fields. Now I have to supply them all up front, while perhaps none of them is used.
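To illustrate the callback idea, here is a minimal sketch: a hypothetical FieldProvider is asked for a value only when the template actually references the field, so unused calculated fields are never computed. The ${...} substitution below is a toy stand-in for the real Freemarker merge that JODReports performs, not its actual API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CallbackMerge {

    // Hypothetical callback interface: the merge asks for a field's
    // value only when the template actually references it.
    interface FieldProvider {
        String valueOf(String field);
    }

    private static final Pattern VAR = Pattern.compile("\\$\\{(\\w+)\\}");

    // Toy ${...} substitution standing in for the real merge step.
    public static String merge(String template, FieldProvider provider) {
        Matcher m = VAR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // The provider is only consulted for fields that occur
            // in the template; other calculated fields are skipped.
            m.appendReplacement(out,
                    Matcher.quoteReplacement(provider.valueOf(m.group(1))));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        FieldProvider provider = new FieldProvider() {
            public String valueOf(String field) {
                // An expensive calculated field would be computed here,
                // and only if the template really uses it.
                return "city".equals(field) ? "Amsterdam" : "1234 AB";
            }
        };
        System.out.println(merge("${zip}  ${city}", provider));
    }
}
```

With an interface like this, the calculated fields would be computed lazily instead of having to be supplied up front.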