Having done similar work in the past, I used the Apache Velocity and POI libraries to write a handy utility. The Velocity library is a templating engine, so I can write a test script, insert some bind variables into it and have Velocity do my variable substitution for me. In addition, Velocity allows you to build logic into the templates, meaning based on our bind variables, the template can be generated in different ways! The second library, Apache POI is used to read in a spreadsheet, this spreadsheet contains all the bind variables.
Combining these two things together, you provide a template and a spreadsheet and the code below produces as many files as there are rows in the spreadsheet, each file having the row cell values populated in the spreadsheet.
Starting off, lets define some member variables in our code. These are used to store the first and last rows numbers in the spreadsheet and first and last columns (cells). The spreadsheet itself and the first row of the spreadsheet are referenced as well as the template being used to generate the data.
private int firstRow;
private int lastRow;
private int firstCell;
private int lastCell;
Sheet sheet;
Row firstRowObj;
private Template template;
The main() method does some basic house keeping like checking whether the files that are passed in as command line arguments exist, and then initialises the class and calls the generate() method.
public static void main(String[] args) throws Exception {
if (args.length != 2) {
System.out.println("usage: VelociPoi <spreadsheet> <template>");
System.exit(1);
}
final File spreadsheetFile = new File(args[0]);
final File templateFile = new File(args[1]);
if (!(spreadsheetFile.exists() && spreadsheetFile.canRead())) {
System.out.println("Spreadsheet file does not exist or cannot be read: " +
spreadsheetFile.getName());
System.exit(1);
}
if (!(templateFile.exists() && templateFile.canRead())) {
System.out.println("Template file does not exist or cannot be read: " +
templateFile.getName());
System.exit(1);
}
VelociPoi vp = new VelociPoi(spreadsheetFile, templateFile);
vp.generate();
}
In the constructor, the spreadsheet is loaded, the first/last rows and columns are worked out, Velocity is initialised and the template is loaded.
The first row of the spreadsheet is stored as a member variable because this effectively contains all of the variable names (or keys) that we are going to be using in the template. The code later on uses these variable names inside the Velocity context to bind the variables to their values.
private VelociPoi(final File spreadsheetFile, final File templateFile)
throws FileNotFoundException, IOException, InvalidFormatException {
// read in the spreadsheet and fetch the first work sheet in the file
InputStream inputStream = new FileInputStream(spreadsheetFile);
Workbook workBook = WorkbookFactory.create(inputStream);
sheet = workBook.getSheetAt(0);
// get the first and last row numbers
firstRow = sheet.getFirstRowNum();
lastRow = sheet.getLastRowNum();
// fetch the first row so we can read in the bind parameter names
firstRowObj = sheet.getRow(firstRow);
// get the first and last cell numbers within the row
firstCell = firstRowObj.getFirstCellNum();
lastCell = firstRowObj.getLastCellNum();
// initialise Velocity
InputStream inStream = VelociPoi.class.getResourceAsStream("/velocity.properties");
Properties props = new Properties();
// initialise Velocity
props.load(inStream);
Velocity.init(props);
// load the template
template = Velocity.getTemplate(templateFile.getName());
}
Velocity uses resource loaders when reading the templates, these need to be configured in the velocity properties file. I put this file into the classpath and load it as a classpath resource.
The contentes of this file are as follows:
resource.loader = file
file.resource.loader.class = org.apache.velocity.runtime.resource.loader.FileResourceLoader
file.resource.loader.path = /tmp
In the file above, the "file.resource.loader.path" property has to be set to the working directory where the code is running and where the template file is. If this is not set correctly, Velocity will not be able to load the template.
The generate() method is where the real work gets done. This method loops through all of the rows in the spreadsheet (excluding the first row). For each of the rows a new VelocityContext is created, this is where the variable names from the first row are bound to the variable values from the row that's in the loop.
Once all of the variables are bound to their corresponding values, the template and context are merged, generating the output into the writer that was specified. The writer writes the data to a file.
There is a bit of a hack to do with the text values and numeric values in cells. I found that sometimes Excel doesn't set the cell type correctly. Typically I only work with integer type values that are represented as text (they have leading zeros sometimes), so my hack is to convert all numeric type cells into integers.
private void generate() throws IOException {
for (int i = firstRow + 1; i <= lastRow; i++) {
Row row = sheet.getRow(i);
VelocityContext context = new VelocityContext();
String outputFile = null;
// for each row, set the key/value from the spreadsheet
// the first row provides the key names
for (int j = firstCell; j < lastCell; j++) {
final Cell cell = row.getCell(j);
// treat numeric cells as string cells
String value;
if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC) {
value = "" + ((int) cell.getNumericCellValue());
}
else {
value = cell.getStringCellValue();
}
String key = firstRowObj.getCell(j).getStringCellValue();
context.put(key, value);
// set the output file name
if (key.equals("outputFile")) {
outputFile = value;
}
}
// assing variable values in the template and write to file
FileWriter writer = new FileWriter(outputFile);
template.merge(context, writer);
writer.close();
}
}
The spreadsheet looks like this:
To test this I used just a simple dummy template as follows.
$outputFile
$numResults
$id1
$id2
Running the above code produces two files, test1.dat and test2.dat. The content for these is:
test1.dat -
test1.dat
25
1
1111
test2.dat -
test1.dat
25
1
1111
Using this code we can now start producing many data files, or even code and scripts. The templates can also have logic built into them to generate slightly different data.
-i