Tuesday, December 28, 2010

What I do



(I can be contacted by email or Twitter).

I design and build software solutions that address business needs in the simplest possible way. I'm comfortable operating at the nexus of technology and commerce - bridging the gap between the software / hardware teams and the business drivers and key stakeholders, right up to board level.

Currently I'm the Chief Technology Officer for Eysys (we're hiring by the way!). At Eysys we're using big data combined with machine learning to build a next generation ecommerce platform, with baked-in intelligence to optimise conversion and make efficient use of marketing spend.

In a previous life, I was the Head of Data Engineering and Infrastructure at the Thomas Cook Online Travel Agency, using Master Data Management and big data analysis to drive platform conversion and performance.

Prior to that (sheesh!), I was the Chief Technology Officer for Comtec Group - building end-to-end systems for clients in the leisure travel industry, primarily in the UK and US. I led the definition and construction of our travel suite, from fast loading of inventory (e.g. Hotel, Air, Transfers etc.) through GDS selection, with a particular focus on ecommerce. In the ecommerce world we helped our customers to measure and increase online conversion rates, optimize PPC spend, and improve SEO scores and overall consumer engagement. We leveraged analytics, A/B and multivariate testing, and personalization techniques, to name just a few of the tools and techniques in the kit bag.

Before Comtec I worked for a financial services company as a software architect and before that again I worked as a consultant for a well-known business and IT consulting company.

In 2000, I became an external examiner and subject matter expert for the Java Enterprise Architect accreditation from Sun Microsystems - now Oracle. I have presented at JavaOne and written numerous articles on many different aspects of software engineering. In 2010, I co-authored the definitive official study guide to the SCEA exam itself.

I am deeply rooted in Computer Science - I have a particular interest in distributed systems and hold a B.Sc. (1998 - First Class Honours) and an M.Sc. (2002) in Computer Science from University College Dublin. My M.Sc. thesis focused on building a high-throughput grid-like compute engine using Java and Artificial Neural Networks to solve a well-known bioinformatics problem (protein secondary structure prediction).

Friday, December 03, 2010

Early Xmas Cloud presents from Microsoft, Google..

Just about 48 hours apart, Microsoft and Google have released significant updates for their Azure and App Engine cloud offerings just in time for Christmas.

The 1.4.0 App Engine SDK addresses some long-criticised weaknesses - in particular, you can now keep an instance ready to rock and roll at all times, and execute long-running requests (> ten seconds). The ole App Engine has been getting a bit of a kicking recently in the blogosphere, so this is a timely release (assuming the unplanned outages have been sorted out in parallel with this). There's nothing in the release notes about a more SQL-like persistence store along the lines of SQL Azure, so you still need to wrap your head around Google's Datastore and the pros and cons it gives you.

The 1.3 Azure SDK also addresses some weaknesses in Azure, in particular now allowing developers to actually RDP onto their Azure boxen in the cloud, a really big improvement on the current state of affairs (basically you get a headless box with non-straightforward access to log files via the Windows Azure Diagnostics service).

It's interesting how these SDK releases are solidifying the differences between the two cloud offerings - Google are zeroing in on a PaaS model, where you have to code in a supported programming language (currently either Java or Python - I wonder when they will support Google Go?) against a locked-down set of APIs, while Microsoft are moving more towards an IaaS model, where you do what you like cos it's more or less your box. Both approaches have their strengths and weaknesses, and the overall ecosystem is stronger for having both.

Monday, September 27, 2010

The curious case of Oracle, the JDK and plan B (aka the prune juice plan)



Mark Reinhold (Chief Architect of the Java Platform Group at Oracle) posted a Plan A and Plan B approach (just like a classic A/B ecommerce conversion test, eh?!) for the JDK roadmap in advance of last week's JavaOne in San Francisco, the annual Java love fest. For me, this was the biggest item I was looking for - the time gap between JDK 6 and 7 has been ridiculous.

From his "Re-thinking JDK 7" post, the options proposed are:

<snip>

Plan A: JDK 7 (as currently defined) Mid 2012

Plan B: JDK 7 (minus Lambda, Jigsaw, and part of Coin) Mid 2011
JDK 8 (Lambda, Jigsaw, the rest of Coin, ++) Late 2012

</snip>

I am firmly in favour of the option eventually selected - option B. It's clear that the JDK has a huge feature log jam. Selecting option B is like giving the JDK release schedule a big dose of prune juice - you know something's gonna start moving.

So to understand what Plan B means for you as a Java architect, I suggest that it can be broken down into these four steps.

1. Read the negative comment to a further post by Mark announcing the decision - this comment represents why you would be unhappy with Plan B. I reproduce it here for the lazy reader (not you, the other guy):

"Hi Mark,


To me, "JDK 7 minus Lambda, Jigsaw and part of Coin" doesn't sound much like "Getting Java moving again" :-(


This schedule is very disappointing.


Posted by Cedric on September 08, 2010 at 10:06 AM PDT"

2. Read the response to the negative comment to understand what Plan B entails. Again, reproduced here:

"JDK 7 - (Lambda + Jigsaw + part of Coin) = Most of Coin + NIO.2 (JSR 203) +
InvokeDynamic (JSR 292) + "JSR 166y" (fork/join, etc.) + most everything else
on the current feature list (http://openjdk.java.net/projects/jdk7/features/) +
possibly a few additional features TBD.


Posted by Mark Reinhold on September 08, 2010 at 10:26 AM PDT"

The TBD bit is a tad ambiguous - let's ignore it by assuming nothing major is going to get in now, given the sheer volume of regression and platform testing needed before a JDK hits gold / GA status.

3. So now you know Project Coin is the biggie for JDK 7 - therefore you need to download the presentation from this year's JavaOne 2010 session on Coin (119 slides, but a lot of those are just slides bitching about how hard it all is; the seminal slides are 10 and 23 - 66). Try-with-resources (Automatic Resource Management) looks great - it's the equivalent of C#'s using keyword (see the short C# sketch after this list). Enhanced exception handling will enable better code as well.

4. [Optional, for the dedicated reader] Some more light bedtime reading - follow the links from the JDK 7 roadmap, especially for Project Lambda (closures) and Jigsaw (modular Java). This will then get JDK 8 on your forward-looking radar.
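
For readers who haven't met C#'s using keyword, here's a minimal sketch of the pattern that Java 7's try-with-resources mirrors - purely illustrative, with a hypothetical file name:

// A minimal C# sketch of the "using" pattern that try-with-resources
// (Automatic Resource Management) mirrors - the reader is disposed
// automatically, whether or not an exception is thrown.
using System;
using System.IO;

class UsingDemo
{
    static void Main()
    {
        using (var reader = new StreamReader("input.txt"))
        {
            Console.WriteLine(reader.ReadLine());
        } // reader.Dispose() is called here, exception or not
    }
}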

Now the **real** question is what will JEE 7 look like?!

Sunday, August 29, 2010

Umbraco CMS - complete install on Windows Azure (the Microsoft cloud)

We use the Umbraco CMS a lot at work - it's widely regarded as one of the best (if not the best) CMSs out there in the .NET world. We've also done quite a bit of R&D work on Microsoft's Azure cloud offering, and this blog post shares a bit of that knowledge (all of the other guides out there appear to focus on getting the Umbraco database running on SQL Azure, but not on how to get the Umbraco server-side application itself up and running on Azure). The cool thing is that Umbraco comes up quite nicely on Azure, with only config changes needed (no code changes).

So, first let's review the toolset / platforms I used:

* Umbraco 4.5.2, built for .NET 3.5
* Latest Windows Azure Guest OS (1.5 - Release 201006-01)
* Visual Studio 2010 Professional
* SQL Express 2008 Management Studio
* .NET 3.5 sp1


Step one is simply to get Umbraco running happily in VS 2010 as a regular ASP.NET project. The steps to achieve this are well documented here. Test your work by firing up Umbraco locally, accessing the admin console and generating a bit of content (XSLTs / Macros / Documents etc.) before progressing further. (The key to working efficiently with Azure is to always have a working case to fall back on, instead of wondering what bit of your project is not cloud-friendly).

Then use these steps to make your Umbraco project "Azure-aware". Again, test your installation by deploying to the Azure Dev Compute and Storage Fabric on your local machine and checking that Umbraco works as it should before going to production. The Azure Dev environment is by no means perfect (see below), nor a true replica of Azure Production, but it's a good check nonetheless.

Now we need to use the SQL Azure Migration Wizard tool to migrate the Umbraco SQL Express database. I used v3.3.6 (which worked fine with SQL Express, contrary to some of the comments on the site) to convert the Umbraco database to its SQL Azure equivalent - the only change the migration tool has to make is to add a clustered index on one of the tables (dbo.umbracoUserLogins), as follows; everything else migrates over to SQL Azure easily:



CREATE CLUSTERED INDEX [ci_azure_fixup_dbo_umbracoUserLogins] ON [dbo].[umbracoUserLogins]
(
[userID]
)WITH (IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF)
GO

Then create a new database in SQL Azure and re-play the script generated by AzureMW into it to create the db schema and standing data that Umbraco expects. To connect to it, you'll replace a line like this in the Umbraco web.config:

<add key="umbracoDbDSN" value="server=.\SQLExpress;database=umbraco452;user id=xxx;password=xxx" />


with a line like this:

<add key="umbracoDbDSN" value="server=tcp:<<youraccountname>>.database.windows.net;database=umbraco;user id=<<youruser>>@<<youraccount>>;password=<<yourpassword>>" />

So we now have the Umbraco database running in SQL Azure, and the Umbraco codebase itself wrapped using an Azure WebRole and deployed to Azure as a package. If we do this using the Visual Studio tool set, we get:

19:27:18 - Preparing...
19:27:19 - Connecting...
19:27:19 - Uploading...
19:29:48 - Creating...
19:31:12 - Starting...
19:31:52 - Initializing...
19:31:52 - Instance 0 of role umbraco452_net35 is initializing
19:38:35 - Instance 0 of role umbraco452_net35 is busy
19:40:15 - Instance 0 of role umbraco452_net35 is ready
19:40:16 - Complete.

Note the total time taken - Azure is deploying a new VM image for you when it does this, it's not just deploying a web app to IIS, so the time taken is always ~ 13 minutes, give or take. I wish it was quicker..



Final comments

If you deploy and it takes longer than ~13 minutes, then double check the common Azure gotchas. In my experience they are:

1. Missing assemblies in production - your project runs fine on the Dev Fabric but just hangs in Production on deploy. For Umbraco you need to make sure that Copy Local is set to true for cms.dll, businesslogic.dll and of course umbraco.dll so that they get packaged up.

2. Forgetting to change the default value of DiagnosticsConnectionString in ServiceConfiguration.cscfg (by default it wants to persist to local storage, which is inaccessible in production). You'll need to use an Azure storage service and update the connection string to match, e.g. your ServiceConfiguration.cscfg should look something like this:

<?xml version="1.0"?>
<ServiceConfiguration serviceName="UmbracoCloudService" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="umbraco452_net35">
    <Instances count="1" />
    <ConfigurationSettings>
      <Setting name="DiagnosticsConnectionString" value="DefaultEndpointsProtocol=https;AccountName=travelinkce;AccountKey=youraccountkey" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>


You also need to run Umbraco in full-trust mode, otherwise you will get a security exception when Umbraco tries to read files that are not inside its own "local store" as defined by the .NET CAS (Code Access Security) subsystem running on the production Azure VM. In other words, you need the enableNativeCodeExecution property set to true in your ServiceDefinition.csdef, like so:

<?xml version="1.0" encoding="utf-8"?>

<ServiceDefinition name="UmbracoCloudService" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="umbraco452_net35" enableNativeCodeExecution="true">
    <InputEndpoints>
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </InputEndpoints>
    <ConfigurationSettings>
      <Setting name="DiagnosticsConnectionString" />
    </ConfigurationSettings>
  </WebRole>
</ServiceDefinition>



The Azure development tools (Fabric etc.) are quite immature in my opinion - very slow to start up (circa one minute), and they simply crash when you've done something wrong rather than giving a meaningful error message and exiting. For example, when trying to access a local SQL Server Express database (which is wrong - fair enough), the load balancer simply crashed with a System.Net.Sockets.SocketException {"An existing connection was forcibly closed by the remote host"}. I have the same criticism of the Azure production system - do a search to see how many people spin their wheels waiting for their roles to deploy with no feedback as to what is going / has gone wrong. Azure badly needs more dev-friendly logging output.

I couldn't get the .NET 4.0 build of Umbraco to work (and it should, .NET 4.0 is now supported on Azure). The problem appears to lie in missing sections in the machine.config file on my Azure machine that I haven't had the time or inclination to dig into yet.

You'll also find that the following directories do not get packaged up into your Azure deployment package by default: xslt, css, scripts, masterpages. To get around this quickly, I just put an empty file in each directory to force their inclusion in the build. If these directories are missing, you will be unable to create content in Umbraco.


Exercises for the reader

* Convert the default InProc session state used by Umbraco to SQLServer mode (otherwise you will have a problem once you scale out beyond one instance on Azure). Starting point is this article - http://blogs.msdn.com/b/sqlazure/archive/2010/08/04/10046103.aspx, but google for errata to the script - the original script supplied does not work out of the box.

* Use an Azure XDrive or similar to store content in one place and cluster Umbraco.

Wednesday, August 18, 2010

Using Ninject as your Dependency Injection container in ASP.NET MVC 3

MVC 3 Preview 1 has been available for a few weeks now from Microsoft, with Preview 2 scheduled for release sometime next month.

As a web development framework, MVC 3 is pretty cool - simple to set up and start using, with a terse, clean syntax courtesy of the new Razor view engine. Coupled with Entity Framework 4 (supporting both code-first generation of database schemas and wrapping existing database schemas), MVC 3 + EF 4 has the makings of a very good web development stack.
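
To make the "code-first" part concrete, here's a minimal sketch assuming the code-first CTP's DbContext / DbSet API that layers on top of EF 4 - the Special and SpecialsContext names are hypothetical, not from any real project:

// Sketch of EF "code-first": the database schema is generated from
// plain C# classes. Assumes the code-first CTP's DbContext / DbSet
// API on top of EF 4; Special and SpecialsContext are hypothetical.
using System.Data.Entity;

public class Special
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public class SpecialsContext : DbContext
{
    public DbSet<Special> Specials { get; set; }
}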

If you're interested in using Ninject as the Dependency Injection (DI) container in MVC 3, then you'll find the code below interesting - I couldn't find this anywhere else on the web so ended up writing it. It's the required implementation of the System.Web.Mvc.IMvcServiceLocator that gets instantiated and used in the Application_Start method in Global.asax.cs.

Using DI with MVC 3 makes a lot of sense - we use it to decouple concrete implementations from the interface that we code against so that we can quickly swap in alternate implementations, e.g. a quick, self-contained in-memory database for unit testing using Moq or similar.
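
As a purely illustrative sketch of that swap - assuming a hypothetical ISpecialRepository with a GetAll() method, re-declared here so the snippet stands alone - a unit test can rebind the interface to a Moq fake instead of the real database-backed implementation:

// Illustrative only: Special and ISpecialRepository are hypothetical
// stand-ins mirroring the interface that gets bound to
// DbSpecialRepository further down this post.
using System.Collections.Generic;
using Moq;
using Ninject;

public class Special
{
    public string Name { get; set; }
}

public interface ISpecialRepository
{
    IEnumerable<Special> GetAll();
}

public class SpecialServiceTests
{
    public void GetAll_returns_canned_data()
    {
        // Fake the repository with Moq - no database required
        var fake = new Mock<ISpecialRepository>();
        fake.Setup(r => r.GetAll())
            .Returns(new[] { new Special { Name = "Test special" } });

        // Bind the interface to the fake instead of DbSpecialRepository
        var kernel = new StandardKernel();
        kernel.Bind<ISpecialRepository>().ToConstant(fake.Object);

        var repository = kernel.Get<ISpecialRepository>();
        // ...exercise the code under test against 'repository' here
    }
}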

This link from Brad Wilson shows how to set up Microsoft Unity as the dependency injection container, and this presentation from Phil Haack gives a fleeting, tantalising glimpse of how the Ninject equivalent might look - but there's nowhere to get the complete code you need to get it working!

So I put the two together in order to use Ninject as my DI container. Here's the code (with zero comments as per my normal coding standard):

using System;
using System.Collections.Generic;
using System.Web.Mvc;
using Ninject;

namespace AdminApp.Models
{
    public class NinjectMvcServiceLocator : IMvcServiceLocator
    {
        public IKernel Kernel { get; private set; }

        public NinjectMvcServiceLocator(IKernel kernel)
        {
            Kernel = kernel;
        }

        public object GetService(Type serviceType)
        {
            try
            {
                return Kernel.Get(serviceType);
            }
            catch (Ninject.ActivationException e)
            {
                throw new System.Web.Mvc.ActivationException("PAK", e);
            }
        }

        public IEnumerable<TService> GetAllInstances<TService>()
        {
            try
            {
                return Kernel.GetAll<TService>();
            }
            catch (Ninject.ActivationException e)
            {
                throw new System.Web.Mvc.ActivationException("PAK", e);
            }
        }

        public IEnumerable<object> GetAllInstances(Type serviceType)
        {
            try
            {
                return Kernel.GetAll(serviceType);
            }
            catch (Ninject.ActivationException e)
            {
                throw new System.Web.Mvc.ActivationException("PAK", e);
            }
        }

        public TService GetInstance<TService>()
        {
            try
            {
                return Kernel.Get<TService>();
            }
            catch (Ninject.ActivationException e)
            {
                throw new System.Web.Mvc.ActivationException("PAK", e);
            }
        }

        public TService GetInstance<TService>(string key)
        {
            try
            {
                return Kernel.Get<TService>(key);
            }
            catch (Ninject.ActivationException e)
            {
                throw new System.Web.Mvc.ActivationException("PAK", e);
            }
        }

        public object GetInstance(Type serviceType)
        {
            try
            {
                return Kernel.Get(serviceType);
            }
            catch (Ninject.ActivationException e)
            {
                throw new System.Web.Mvc.ActivationException("PAK", e);
            }
        }

        public object GetInstance(Type serviceType, string key)
        {
            try
            {
                return Kernel.Get(serviceType, key);
            }
            catch (Ninject.ActivationException e)
            {
                throw new System.Web.Mvc.ActivationException("PAK", e);
            }
        }

        public void Release(object instance)
        {
            try
            {
                Kernel.Release(instance);
            }
            catch (Ninject.ActivationException e)
            {
                throw new System.Web.Mvc.ActivationException("PAK", e);
            }
        }
    }
}





And here's how to instantiate and use it in Global.asax.cs:

var kernel = new StandardKernel(new NinjectRegistrationModule());
var locator = new NinjectMvcServiceLocator(kernel);
MvcServiceLocator.SetCurrent(locator);
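
For context, here's a minimal sketch of how those three lines might sit inside Application_Start - the route registration shown is just the standard MVC project template boilerplate, so adjust it to match your own project:

// Sketch of Global.asax.cs - the Ninject wiring from above plus the
// standard route registration from the default MVC project template.
using System.Web.Mvc;
using System.Web.Routing;
using AdminApp.Models;
using Ninject;

namespace AdminApp
{
    public class MvcApplication : System.Web.HttpApplication
    {
        public static void RegisterRoutes(RouteCollection routes)
        {
            routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

            routes.MapRoute(
                "Default",
                "{controller}/{action}/{id}",
                new { controller = "Home", action = "Index", id = UrlParameter.Optional });
        }

        protected void Application_Start()
        {
            AreaRegistration.RegisterAllAreas();
            RegisterRoutes(RouteTable.Routes);

            // Wire up Ninject as the MVC 3 service locator
            var kernel = new StandardKernel(new NinjectRegistrationModule());
            var locator = new NinjectMvcServiceLocator(kernel);
            MvcServiceLocator.SetCurrent(locator);
        }
    }
}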


Finally, here's a sample NinjectRegistrationModule which maps the implementation I want onto the generic interface that my code consumes:

using Ninject.Modules;
using AdminApp.Controllers;

namespace AdminApp
{
    class NinjectRegistrationModule : NinjectModule
    {
        public override void Load()
        {
            Bind<ISpecialRepository>().To<DbSpecialRepository>().InRequestScope();
        }
    }
}
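
And to close the loop, here's a sketch of a controller that consumes the binding above via constructor injection - assuming (as in the Unity walkthrough linked earlier) that the default controller factory resolves controllers through the registered service locator. The GetAll() member on ISpecialRepository is hypothetical:

// Sketch only: assumes ISpecialRepository (with a hypothetical GetAll()
// method) lives in AdminApp.Controllers as per the registration module.
// With MvcServiceLocator set in Application_Start, MVC should resolve
// the constructor dependency through Ninject.
using System.Web.Mvc;

namespace AdminApp.Controllers
{
    public class SpecialsController : Controller
    {
        private readonly ISpecialRepository _repository;

        public SpecialsController(ISpecialRepository repository)
        {
            _repository = repository;
        }

        public ActionResult Index()
        {
            return View(_repository.GetAll());
        }
    }
}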

Friday, July 23, 2010

Effect of the SCEA study guide on the exam

The SCEA study guide book - especially chapter nine - is already having an effect on the exam. And that effect is interesting, mostly positive but with some negatives as well.

In general, it is fair to say that the overall standard of submissions has improved, and a lot of submissions clearly contain cues from chapter nine of the book - naming conventions, diagram layout, adoption of the server A and B spec approach for the deployment diagram - it's all there in a lot of submissions.

The book has made some of the submissions more anodyne / bland / standardized, which in turn makes me a little sentimental for the past. There's nothing like trying to traverse a crazy class diagram late at night for keeping your brain sharp!

In my opinion, a small but not insignificant percentage of candidates (a bit less than 10%) actually end up submitting a **worse** assignment under the influence of the book, and for a very interesting reason. If you buy the book and read it and aren't an architect, then you will have an incomplete understanding of the concepts covered within it. By extension, when you apply the book material to your submission, there is a very good chance that you will make mistakes that are pretty glaring. So the book will make your submission worse, not better.

As a corollary, if you buy the book and really get the material, your application of that new-found material on top of your already substantial knowledge and skills will result in a strong submission.

In summary then, the book is not a magic book.

The interesting medium- / long-term question is whether the exam should always have a pass rate of X% and a fail rate of Y%, or whether it is acceptable for X to approach 100% as a result of the book (that's not happening, but clearly it could).

Saturday, March 27, 2010

Book - feedback so far

The book has just gone back to the printers for a second run. Apparently the first print run (a few thousand I think?) was chewed up by Amazon and direct pre-orders. It's fantastic for that many people to have the book and I really hope it helps you in preparing for the exam.

So, the feedback so far: the reviews on Amazon (both .com and .co.uk) are for the old book, not the new one. Amazon just copied the reviews across (the last one was written two years before the new book was published).

So all I've got to go on are comments that I've received directly. Broadly speaking, reviewers fall into two camps:

1. Those who like the ~200 page guide / map to a much larger body of research material (happy);

2. Those who want / expect to find all of the revision material in one book (not so happy).

Our goal was always to write a book that did not replicate the reams of material that exist for the JEE platform. We simply saw no point in doing that. Instead, we wanted to write a book that the candidate could use to:

1. Construct a revision schedule for Part One;

2. Understand how to approach Part Two - constructing your own solution for a given business problem using the JEE platform;

3. Prepare the candidate for Part Three - defending your Part Two submission and explaining how your solution satisfies various NFRs (non-functional requirements).

Broadly speaking, I think we've hit the goals we set. There is an errata list that will be sent to the publisher for the second print and will be published here as well for the purchasers of the first run.

Tuesday, January 26, 2010

SCEA book publication and shipment dates



The book has gone to the printers! It comes off the press on Monday, February 1st and gets to Pearson's warehouse on February 4th. From there it usually takes a week to get to Amazon (in the US). Here are the Amazon US and Amazon UK links. It's also available for the Kindle.

People who placed pre-orders for hard copy editions will receive their shipment first - shipped direct from Pearson's warehouse next week.

As far as the online edition goes, the Rough Cut disappears after the final update (which matches the printed book) and it then becomes part of the regular Safari Library and is accessible to all subscribers.

If these dates change, I'll put out another update. It will be fantastic to see the book finally out there!