Introduction

  • Spring Batch is a lightweight, comprehensive batch framework
  • It is designed to enable the development of robust batch applications
  • It builds on the productivity and POJO-based development approach of the Spring Framework
  • Spring Batch is not a scheduling framework
  • It is intended to work in conjunction with a scheduler, not to replace one.

Usages of Spring Batch

  • used to perform business operations in mission-critical environments
  • used to automate complex processing of large volumes of data without user interaction
  • processes time-based events and periodic, repetitive, complex operations over large data sets
  • used to integrate internal/external information that requires formatting, validation and processing in a transactional manner
  • used to run parallel or concurrent jobs
  • provides manual or scheduled restart after a failure

Guidelines for using Spring Batch

  • avoid building complex logical structures in a single batch application
  • keep your data close to where the batch processing occurs
  • minimize the use of system resources such as I/O by performing operations in memory wherever possible
  • cache data after the first read from the database in each transaction, and read from the cache from then on
  • avoid unnecessary table or index scans in the database
  • be specific when retrieving data from the database, i.e., retrieve only the required fields, specify a WHERE clause in the SQL statement, etc.
  • avoid doing the same work multiple times in a batch run
  • allocate enough memory before the batch process starts, because reallocating memory during the run is time-consuming
  • check and validate data consistently to maintain data integrity
  • implement checksums for internal validation wherever possible
  • run stress tests at an early stage in a production-like environment
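
As an illustration of the checksum guideline above, here is a minimal sketch that uses the JDK's java.util.zip.CRC32 to compare a record before and after it has been moved or stored. The class name ChecksumSketch and the helper crc32Of are hypothetical names chosen for this example, and the sketch assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ChecksumSketch {

    // Computes a CRC-32 checksum over the UTF-8 bytes of the given text.
    static long crc32Of(String text) {
        CRC32 crc = new CRC32();
        crc.update(text.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        String sent = "abc,abc@abc.com";
        String received = "abc,abc@abc.com";
        // Equal content yields equal checksums; a mismatch signals a changed record.
        System.out.println(crc32Of(sent) == crc32Of(received)); // true
    }
}
```

A checksum like this detects accidental changes only; it is not a security measure.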

For more information on the theory, see http://docs.spring.io/spring-batch/trunk/reference/html/spring-batch-intro.html and http://spring.io/guides/gs/batch-processing/

Now let's look at an example of how it works.

What we will do

We’ll build a service that imports data from a CSV file, transforms it with custom code, and stores the final results in another CSV file. Each input line holds a name and an email address separated by a comma, for example abc,abc@abc.com. You can also store the data in a database or any other persistent storage.

Prerequisites

Any Java based IDE
JDK 1.6+
Maven 3.0+

Step 1. Create a Maven project (standalone or quickstart) in the Eclipse IDE. The necessary project structure gets created.

Step 2. Modify the pom.xml file so that it looks like the one below. Maven downloads all the required jars from the Maven repository.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>in.webtuts</groupId>
    <artifactId>spring-batch</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>spring-batch</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <!-- Inherit defaults from Spring Boot -->
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.1.6.RELEASE</version>
    </parent>

    <dependencies>
        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-core</artifactId>
            <version>3.0.1.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-batch</artifactId>
        </dependency>
    </dependencies>

    <build>
        <resources>
            <resource>
                <directory>src/main/resources</directory>
            </resource>
        </resources>
    </build>
</project>

 

Step 3. Create a business class User.java that represents a row of data for inputs and outputs. You can instantiate the User class either with a name and an email through the constructor, or by setting the properties.

package in.webtuts.spring.batch.bo;

public class User {

    private String name;
    private String email;

    public User() {
    }

    public User(String name, String email) {
        this.name = name;
        this.email = email;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getEmail() {
        return email;
    }

    public void setEmail(String email) {
        this.email = email;
    }

    @Override
    public String toString() {
        return "name: " + name + ", email:" + email;
    }

}

Step 4. Create an intermediate processor. A common paradigm in batch processing is to ingest data, transform it, and then pipe it out somewhere else. Here we write a simple transformer that converts the names to uppercase and changes the email domain.

package in.webtuts.spring.batch.itemprocess;

import in.webtuts.spring.batch.bo.User;

import org.springframework.batch.item.ItemProcessor;

public class UserItemProcessor implements ItemProcessor<User, User> {

    @Override
    public User process(final User user) throws Exception {
        final String domain = "roytuts.com";
        final String name = user.getName().toUpperCase();
        final String email = user.getEmail().substring(0,
                user.getEmail().indexOf("@") + 1)
                + domain;
        final User transformedUser = new User(name, email);
        System.out.println("Converting [" + user + "] => [" + transformedUser
                + "]");
        return transformedUser;
    }

}

UserItemProcessor implements Spring Batch’s ItemProcessor interface, which makes it easy to wire the code into the batch job that we define further down in this guide. According to the interface, we receive an incoming User object, convert its name to upper case, replace the email domain with roytuts.com, and return the result as a new User object.
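
The domain replacement can also be looked at in isolation. The following is a minimal sketch in plain Java, without Spring; the class name EmailDomainSketch and the method replaceDomain are hypothetical names used only here. Unlike the processor above, this sketch leaves an address without an "@" unchanged, which is an assumption made for this example.

```java
public class EmailDomainSketch {

    // Replaces everything after the first '@' with newDomain.
    // Assumption of this sketch: an address without '@' is returned unchanged.
    // (The processor above would return just the domain in that case,
    // because indexOf returns -1 and substring(0, 0) is empty.)
    static String replaceDomain(String email, String newDomain) {
        int at = email.indexOf('@');
        if (at < 0) {
            return email;
        }
        return email.substring(0, at + 1) + newDomain;
    }

    public static void main(String[] args) {
        System.out.println(replaceDomain("abc@abc.com", "roytuts.com")); // abc@roytuts.com
    }
}
```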

Step 5. Now we will write a batch job. The @EnableBatchProcessing annotation sets up the Spring Batch infrastructure. In this example the job metadata is kept in memory, meaning that when processing is done, that data is gone.

package in.webtuts.spring.batch.configuration;

import in.webtuts.spring.batch.bo.User;
import in.webtuts.spring.batch.itemprocess.UserItemProcessor;
import in.webtuts.spring.batch.mapper.UserFieldSetMapper;
import in.webtuts.spring.batch.utils.CSVUtils;

import java.io.File;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor;
import org.springframework.batch.item.file.transform.DelimitedLineAggregator;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
@EnableBatchProcessing
public class UserBatchConfiguration {

    @Bean
    //creates an item reader
    public ItemReader<User> reader() {
        FlatFileItemReader<User> reader = new FlatFileItemReader<User>();
        //look for file user.csv
        reader.setResource(new FileSystemResource(System
                .getProperty("user.dir")
                + File.separator
                + "src/main/resources/user.csv"));
        //line mapper
        DefaultLineMapper<User> lineMapper = new DefaultLineMapper<User>();
        //each line with comma separated
        lineMapper.setLineTokenizer(new DelimitedLineTokenizer());
        //map file's field with object
        lineMapper.setFieldSetMapper(new UserFieldSetMapper());
        reader.setLineMapper(lineMapper);
        return reader;
    }

    @Bean
    //creates an instance of our UserItemProcessor for transformation
    public ItemProcessor<User, User> processor() {
        return new UserItemProcessor();
    }

    @Bean
    //creates item writer
    public ItemWriter<User> writer() {
        FlatFileItemWriter<User> writer = new FlatFileItemWriter<User>();
        //get the output file
        writer.setResource(new FileSystemResource(System
                .getProperty("user.dir")
                + File.separator
                + "src/main/resources/transformed_user.csv"));
        //delete if the file already exists
        writer.setShouldDeleteIfExists(true);
        //create lines for writing to file
        DelimitedLineAggregator<User> lineAggregator = new DelimitedLineAggregator<User>();
        //delimit field by comma
        lineAggregator.setDelimiter(",");
        //extract field from ItemReader
        BeanWrapperFieldExtractor<User> fieldExtractor = new BeanWrapperFieldExtractor<User>();
        //use User object's properties
        fieldExtractor.setNames(new String[] { "name", "email" });
        lineAggregator.setFieldExtractor(fieldExtractor);
        //write whole data
        writer.setLineAggregator(lineAggregator);
        return writer;
    }

    @Bean
    //define job which is built from step
    public Job importUserJob(JobBuilderFactory jobs, Step step) {
        //the incrementer supplies a new run.id parameter for each run
        return jobs.get("importUserJob").incrementer(new RunIdIncrementer())
                .flow(step).end().build();
    }

    @Bean
    //define step
    public Step step1(StepBuilderFactory stepBuilderFactory,
            ItemReader<User> reader, ItemWriter<User> writer,
            ItemProcessor<User, User> processor) {
        //chunk(5) sets how many items are read and processed
        //before they are written together, here up to five records at a time.
        //Next, we configure the reader, processor, and writer
        return stepBuilderFactory.get("step1").<User, User> chunk(5)
                .reader(reader).processor(processor).writer(writer).build();
    }

    @Bean
    //CSVUtils is a helper for reading records back from a csv file
    public CSVUtils csvUtils() {
        return new CSVUtils();
    }
}
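
To make the effect of chunk(5) more concrete, here is a simplified model of chunk-oriented processing in plain Java. It is only a sketch of the idea, not Spring Batch's implementation: the class ChunkSketch and the method run are hypothetical, the "processor" simply upper-cases each item, and the transaction that Spring Batch wraps around each chunk is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ChunkSketch {

    // Reads up to chunkSize items, processes them, then "writes" them as one list.
    // Returns every list that was written, in order.
    static List<List<String>> run(Iterator<String> reader, int chunkSize) {
        List<List<String>> written = new ArrayList<List<String>>();
        while (reader.hasNext()) {
            // read phase: collect at most chunkSize items
            List<String> inputs = new ArrayList<String>();
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            // process phase: transform each item
            List<String> outputs = new ArrayList<String>();
            for (String item : inputs) {
                outputs.add(item.toUpperCase());
            }
            // write phase: hand the whole chunk to the writer at once
            written.add(outputs);
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        System.out.println(run(items.iterator(), 5)); // [[A, B, C, D, E], [F, G]]
    }
}
```

With seven items and a chunk size of five, the writer is called twice: once with five items and once with the remaining two.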

Step 6. This batch processing can also be embedded in web apps, but here we will create a main method to run the application. You can also build an executable jar from it.

package in.webtuts.spring.batch.test;

import in.webtuts.spring.batch.utils.CSVUtils;

import java.io.File;
import java.net.URISyntaxException;
import java.util.List;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableAutoConfiguration
@ComponentScan({ "in.webtuts.spring.batch.bo",
        "in.webtuts.spring.batch.configuration",
        "in.webtuts.spring.batch.itemprocess",
        "in.webtuts.spring.batch.mapper", "in.webtuts.spring.batch.utils" })
public class SpringBatchTest {

    /**
     * @param args
     * @throws URISyntaxException
     */
    public static void main(String[] args) throws URISyntaxException {
        ApplicationContext applicationContext = SpringApplication.run(
                SpringBatchTest.class, args);
        File file = new File(System.getProperty("user.dir") + File.separator
                + "src/main/resources/transformed_user.csv");
        //new File(...) is never null, so check that the output file exists
        if (file.exists()) {
            List<String[]> records = applicationContext.getBean(CSVUtils.class)
                    .getCSVRecords(file);
            if (records != null) {
                for (String[] record : records) {
                    System.out.println(record[0] + ", " + record[1]);
                }
            }
        }
    }

}

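Step 7. Create CSVUtils.java and UserFieldSetMapper.java. CSVUtils reads records back from a CSV file, and UserFieldSetMapper maps the fields of each input line to a User object.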

package in.webtuts.spring.batch.utils;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class CSVUtils {

    public List<String[]> getCSVRecords(File file) {
        List<String[]> records = null;
        BufferedReader bf = null;
        try {
            if (file != null) {
                String line = null;
                bf = new BufferedReader(new InputStreamReader(
                        new FileInputStream(file)));
                records = new ArrayList<String[]>();
                int i = 0;
                while ((line = bf.readLine()) != null) {
                    //skip the first line, which is treated as a header row
                    if (i == 0) {
                        i++;
                        continue;
                    }
                    String[] record = line.split(",");
                    records.add(record);
                }
            }
        } catch (FileNotFoundException ex) {
            ex.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (bf != null) {
                try {
                    bf.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        return records;
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        CSVUtils csvUtils = new CSVUtils();
        File file = new File(System.getProperty("user.dir") + File.separator
                + "src/main/resources/user.csv");
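        //new File(...) is never null, so check that the file exists
        if (file.exists()) {
            List<String[]> records = csvUtils.getCSVRecords(file);
            //getCSVRecords returns null if the file could not be opened
            if (records != null) {
                for (String[] record : records) {
                    System.out.println(record[0] + ", " + record[1]);
                }
            }
        }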
    }

}

package in.webtuts.spring.batch.mapper;

import in.webtuts.spring.batch.bo.User;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

public class UserFieldSetMapper implements FieldSetMapper<User> {

    @Override
    public User mapFieldSet(FieldSet fieldSet) throws BindException {
        User user = new User();
        user.setName(fieldSet.readString(0));
        user.setEmail(fieldSet.readString(1));
        return user;
    }

}
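
What the tokenizer, field set mapper and line aggregator do together can be pictured with a small plain-Java round trip. This is a sketch under simplifying assumptions (comma delimiter, no quoting or escaping), and the class LineMappingSketch with its methods tokenize and aggregate is hypothetical; it does not use the Spring Batch classes.

```java
public class LineMappingSketch {

    // Splits a comma-delimited line into fields (no quoting support).
    static String[] tokenize(String line) {
        return line.split(",", -1);
    }

    // Joins fields back into a comma-delimited line.
    static String aggregate(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(fields[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] fields = tokenize("abc,abc@abc.com");
        // field 0 is the name and field 1 is the email, as in UserFieldSetMapper
        String name = fields[0].toUpperCase();
        String email = fields[1];
        System.out.println(aggregate(name, email)); // ABC,abc@abc.com
    }
}
```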

 

Output

2014-09-19 11:16:25.263  INFO 8156 --- [main] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=importUserJob]] launched with the following parameters: [{run.id=1}]
2014-09-19 11:16:25.279  INFO 8156 --- [main] o.s.batch.core.job.SimpleStepHandler     : Executing step: [step1]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
Converting [name: abc, email:abc@abc.com] => [name: ABC, email:abc@roytuts.com]
2014-09-19 11:16:25.326  INFO 8156 --- [main] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=importUserJob]] completed with the following parameters: [{run.id=1}] and the following status: [COMPLETED]
2014-09-19 11:16:25.326  INFO 8156 --- [main] i.w.spring.batch.test.SpringBatchTest    : Started SpringBatchTest in 1.968 seconds (JVM running for 2.342)
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
ABC, abc@roytuts.com
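
Note that the log shows 32 conversions but only 31 records are printed afterwards. The reason is that CSVUtils.getCSVRecords always skips the first line as if it were a header, while transformed_user.csv is written without one. One possible way around this, sketched here with the hypothetical class HeaderSkipSketch and method toRecords, is to make header skipping an explicit choice of the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HeaderSkipSketch {

    // Converts lines to comma-separated records.
    // The first line is dropped only when skipHeader is true.
    static List<String[]> toRecords(List<String> lines, boolean skipHeader) {
        List<String[]> records = new ArrayList<String[]>();
        int start = skipHeader ? 1 : 0;
        for (int i = start; i < lines.size(); i++) {
            records.add(lines.get(i).split(","));
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("ABC,abc@roytuts.com", "ABC,abc@roytuts.com");
        System.out.println(toRecords(lines, true).size());  // 1
        System.out.println(toRecords(lines, false).size()); // 2
    }
}
```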

That’s all. Thanks for reading.
