Tuesday, January 26, 2010

Running Groovy Code from XML Files

For an upcoming Java project, I'm having a look at Groovy. An obvious first opportunity for using it is unit tests. Here's another use case I just made up (that has nothing to do with the project whatsoever ;). Suppose you have two domain objects: one representing a mailing address, maybe stored in a database, and another representing some kind of form that should be filled from the available data, which includes, among other things, an address. So you might come up with an XML format that maps fields from one object to the other. Maybe one contains a field "city" and the other one a field named "town", and you have something like

<mapping>
    <destination>town</destination>
    <source>city</source>
</mapping>

That's not very exciting. But then you have something like "streetName" and "streetNumber" in one object and "street" in the other one. Now you need to do some simple string manipulation to map fields from source to destination. So, extend the XML! Maybe like this?

<mapping>
    <destination>street</destination>
    <source>
        <combine>
            <field>streetName</field>
            <field>streetNumber</field>
        </combine>
    </source>
</mapping>

Doesn't look so bad, does it? Until you need to do the opposite and split up a field. You end up extending your XML format, slowly implementing a programming language. In XML. Not exactly pretty, IMHO. And lots of pointless work to implement. What's the alternative? Don't use XML if you need logic! Use a *programming* language! So you write the mappings in Java. This is very time-consuming as well. Suppose you want to generate some of these mappings automatically. Maybe give your users a tool to create them easily, with an intuitive interface. Generating Java code from that might not be such a good idea.
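
To make the "split up a field" case concrete, here is a minimal sketch in plain Java of the kind of logic such a mapping needs. The StreetSplitter class and its rule (a leading run of digits is the house number) are assumptions made up for illustration; they are not part of the project.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, invented for this illustration only.
class StreetSplitter {
    // Assumption: the house number is a leading run of digits followed by whitespace.
    private static final Pattern LEADING_NUMBER = Pattern.compile("^(\\d+)\\s+(.+)$");

    /** Returns {streetNumber, streetName}; streetNumber is "" if no leading number is found. */
    static String[] split(String street) {
        String trimmed = street.trim();
        Matcher m = LEADING_NUMBER.matcher(trimmed);
        if (m.matches()) {
            return new String[] { m.group(1), m.group(2) };
        }
        return new String[] { "", trimmed };
    }

    public static void main(String[] args) {
        String[] parts = split("742 Evergreen Terrace");
        // prints: streetNumber=742, streetName=Evergreen Terrace
        System.out.println("streetNumber=" + parts[0] + ", streetName=" + parts[1]);
    }
}
```

Even this tiny rule needs a regular expression and a fallback branch, which is exactly the sort of thing that is painful to express as XML elements.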

Ok, enough of coming up with reasons and examples why you'd want to do this. Let's just agree that we want to embed little snippets of Groovy code in an XML mapping document to perform simple transformations of source to destination fields. The scope in which the snippets are run should contain the source object, and they should return a string, which is the content of the destination field. Our domain objects might look like this:

public class Address {
    private HashMap<String, String> fields = new HashMap<String, String>();

    public Address() {
        setField("firstName", "");
        setField("lastName", "");
        setField("street", "");
        setField("zipcode", "");
        setField("city", "");
    }

    public Address(
                String firstName,
                String lastName,
                String street,
                String zipcode,
                String city) {
        setField("firstName", firstName);
        setField("lastName", lastName);
        setField("street", street);
        setField("zipcode", zipcode);
        setField("city", city);
    }

    public void setField(String key, String value) {
        fields.put(key, value);
    }

    public String getField(String key) {
        return fields.get(key);
    }

    public String toString() {
        return fields.toString();
    }
    
    public String getFirstName() {
        return getField("firstName");
    }

    public void setFirstName(String firstName) {
        setField("firstName", firstName);
    }

    public String getLastName() {
        return getField("lastName");
    }

// ...

and this:

public class FormWithAddress {
    private HashMap<String, String> fields = new HashMap<String, String>();

    public FormWithAddress() {
        setField("name", "");
        setField("streetName", "");
        setField("streetNumber", "");
        setField("zipcode", "");
        setField("city", "");
    }

    public FormWithAddress(
                String name,
                String streetName,
                String streetNumber,
                String zipcode,
                String city) {
        setField("name", name);
        setField("streetName", streetName);
        setField("streetNumber", streetNumber);
        setField("zipcode", zipcode);
        setField("city", city);
    }

    public void setField(String key, String value) {
        fields.put(key, value);
    }

    public String getField(String key) {
        return fields.get(key);
    }

    public String toString() {
        return fields.toString();
    }
    
    public String getName() {
        return getField("name");
    }

    public void setName(String name) {
        setField("name", name);
    }

    public String getStreetName() {
        return getField("streetName");
    }

    public void setStreetName(String streetName) {
        setField("streetName", streetName);
    }


// yadda yadda yadda

So a mapping with a Groovy snippet in it might look like this:

<mapping>
    <source></source>
    <execute><![CDATA[
        firstName = address.getField("firstName")
        lastName = address.getField("lastName")
        return "${lastName}, ${firstName}".toString()
    ]]></execute>
    <destination>name</destination>
</mapping>
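
Getting the three parts out of such a mapping needs nothing beyond the JDK's DOM parser. The following is only a sketch: MappingReader, childText() and readMapping() are hypothetical names, and it assumes a single <mapping> element as the document root, which is probably not how the real mapping file is laid out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for a single <mapping> element (illustration only).
class MappingReader {
    /** Trimmed text of the first descendant element with the given tag name, or "" if absent. */
    static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    /** Parses the XML string and returns {source, destination, execute}. */
    static String[] readMapping(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element mapping = doc.getDocumentElement();
            return new String[] {
                childText(mapping, "source"),
                childText(mapping, "destination"),
                childText(mapping, "execute")   // CDATA content comes back as plain text
            };
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse mapping", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<mapping><source></source>"
                + "<execute><![CDATA[ return address.getField(\"city\") ]]></execute>"
                + "<destination>name</destination></mapping>";
        String[] m = readMapping(xml);
        System.out.println("source='" + m[0] + "' destination='" + m[1]
                + "' execute='" + m[2] + "'");
    }
}
```

The nice part of wrapping the snippet in CDATA is that characters like < and & inside the Groovy code need no escaping, and getTextContent() hands the code back unchanged.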

Now on to the actual point of this blog post. How do we execute the Groovy code that we pull out of the XML file as a string? There are several ways to run Groovy code from Java. As far as I know, some of them allow having a restricted security context. If you take arbitrary text from the network and run it as Groovy scripts, you want to make absolutely sure that it's not possible to call System.exit() or worse (much, much worse ;). Come to think of it, you probably never want to do something like that. Anyway, we'll just look at the one I tried first and found to be working just fine: GroovyShell. For more information about available options have a look at these docs or this blog post.

Say we have a class that does the mapping. It reads the XML file, pulls data out of the source object, puts it into the destination object and, when it encounters a non-empty <execute> tag, runs the code inside. For the sake of loose coupling, we'll introduce a horrendously named interface that we can use as the parameter type for source and destination objects.

public interface FieldsGettableSettable {
    public String getField(String key);
    public void setField(String key, String value);
}
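
For comparison, the alternative to a string-keyed interface like this one is to look up the JavaBean getters and setters by name via reflection. A minimal sketch using only java.lang.reflect follows; ReflectiveFields and the nested Person bean are hypothetical and exist purely for this example.

```java
import java.lang.reflect.Method;

// Hypothetical reflection-based accessor (illustration only).
class ReflectiveFields {
    /** "city" -> "City"; assumes a non-empty field name. */
    static String capitalize(String name) {
        return Character.toUpperCase(name.charAt(0)) + name.substring(1);
    }

    /** Calls the public no-argument getter getXxx() for the named field. */
    static String getField(Object bean, String field) {
        try {
            Method getter = bean.getClass().getMethod("get" + capitalize(field));
            return (String) getter.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No readable field: " + field, e);
        }
    }

    /** Calls the public setter setXxx(String) for the named field. */
    static void setField(Object bean, String field, String value) {
        try {
            Method setter = bean.getClass().getMethod("set" + capitalize(field), String.class);
            setter.invoke(bean, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No writable field: " + field, e);
        }
    }

    // Made-up bean with a single String property, used only by the demo below.
    public static class Person {
        private String city = "";
        public String getCity() { return city; }
        public void setCity(String city) { this.city = city; }
    }

    public static void main(String[] args) {
        Person p = new Person();
        setField(p, "city", "Springfield");
        System.out.println(getField(p, "city")); // prints: Springfield
    }
}
```

This keeps the domain classes free of any mapping-specific interface, at the price of moving typos in field names from compile time to run time.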

When I made up this example, I didn't realize that it would either require something like this setField()/getField() stuff or reflection. Well, maybe there's another, better way. I didn't think about this all that much. If you're used to coding in Python, you tend to take it for granted to be able to seamlessly go back and forth between strings containing the names of functions/methods/members and the actual functions/methods/members. Gotta get back into the Java mindset I guess. Anyway, a rough sketch of the mapper class:

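public class FormMapper {
    // Source and destination objects used by mapField() and applyTransformation();
    // presumably assigned in the elided constructor body.
    private FieldsGettableSettable from;
    private FieldsGettableSettable to;
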
    public FormMapper(FieldsGettableSettable from, FieldsGettableSettable to,
                      String mappingFilename) {
        // ...
    }
 
    public void performMapping() {
        // ...
    }
 
    private void mapField(String source, String destination, String execute) {
        if ((source != null && !source.trim().equals("")) &&
            (destination != null && !destination.trim().equals(""))) {
            to.setField(destination, from.getField(source));
        } else {
            applyTransformation(execute, destination);
        }
    }

    private void applyTransformation(String transformation,
                                     String destination)
                             throws CompilationFailedException {
        Binding binding = new Binding();
        binding.setVariable("address", from);
        GroovyShell shell = new GroovyShell(binding);
        Object result = shell.evaluate(transformation);
        assert result instanceof String;
        to.setField(destination, (String) result);
    }
}

So there it is, inside applyTransformation(). Shouldn't come as a surprise, since all the documentation and blog posts on the subject contain this code. But since I had to do the legwork of putting this together anyway, I figured I might as well blog about it. Considering how little I've managed to blog at all so far... As a little bonus, you may now laugh at my very first Groovy code that I wrote to generate those domain classes that consist of 120% Pure Premium Boilerplate(tm). It's probably not idiomatic and could be written in half as many lines of code. The capitalizeFirstLetter() function looks particularly cumbersome to me. I just ran with the first thing that worked inside groovysh. The template mechanism that's built into Groovy strings is neat. I probably don't need the toString() calls, likewise in the XML file.


#!/usr/bin/env groovy

def capitalizeFirstLetter(str, result="") {
    result += str[0].toUpperCase()
    result += str[1..str.length()-1]
    return result
}

def generateGetter(fieldName) {
    methodName = capitalizeFirstLetter(fieldName, "get")
    return """\
    public String ${methodName}() {
        return getField("${fieldName}");
    }

"""
}

def generateSetter(fieldName) {
    methodName = capitalizeFirstLetter(fieldName, "set")
    return """\
    public void ${methodName}(String ${fieldName}) {
        setField("${fieldName}", ${fieldName});
    }

"""
}

def generateDefaultConstructor(name, fields) {
    result = "    public ${name}() {\n";
    for (field in fields) {
        result += "        setField(\"${field}\", \"\");\n"
    }
    result += "    }\n\n"
    return result
}

def generateFieldsConstructor(name, fields) {
    result = "    public ${name}(\n"
    len = fields.size()
    fields.eachWithIndex() { field, i ->
        if (i+1 < len)
            result += "                String ${field},\n"
        else
            result += "                String ${field}) {\n"
    }

    for (field in fields) {
        result += "        setField(\"${field}\", ${field});\n"
    }
    result += "    }\n\n"
    return result
}

def generateClass(name, pkg, fields) {
    code = """\
package ${pkg};
import java.util.HashMap;

public class ${name} {
    private HashMap<String, String> fields = new HashMap<String, String>();

"""
    code += generateDefaultConstructor(name, fields)
    code += generateFieldsConstructor(name, fields)

    code += """\
    public void setField(String key, String value) {
        fields.put(key, value);
    }

    public String getField(String key) {
        return fields.get(key);
    }

    public String toString() {
        return fields.toString();
    }

"""

    for (field in fields) {
        code += generateGetter(field)
        code += generateSetter(field)
    }

    code += """\
}
"""
    return code.toString()
}

if (args.size() < 3) {
    print "usage: ./GenCls.groovy <Class Name> <Package> <Field 1> [Field 2] ..."
    System.exit(1)
}
print generateClass(args[0], args[1], args[2..args.size()-1])

For what it's worth, all of this stuff is available at http://hg.zeropatience.net/groovyinxml/. Ah, right, an example... So we use the mapper class like this:

public class Main {

    public static void main(String[] args) {
        Address address = new Address("Homer",
                                      "Simpson",
                                      "742 Evergreen Terrace",
                                      "0xBADC0DE",
                                      "Springfield");
        
        FormWithAddress formWithAddress = new FormWithAddress();
        
        try {
            FormMapper formMapper = new FormMapper(address,
                                            formWithAddress,
                                            "Address2FormWithAddress.xml");
            formMapper.performMapping();
        } catch (Exception ex) {
            System.err.println(ex.getLocalizedMessage());
            System.exit(1);
        }
        
        System.out.println("\nMAPPED\n\n");
        System.out.println(address);
        System.out.println("\n\nTO\n\n");
        System.out.println(formWithAddress);    
    }
}

And get the following, exhilarating output:

MAPPED


{lastName=Simpson, zipcode=0xBADC0DE, street=742 Evergreen Terrace, firstName=Homer, city=Springfield}


TO


{zipcode=0xBADC0DE, name=Simpson, Homer, streetNumber=742, streetName=Evergreen Terrace, city=Springfield}

That's all.
