This tutorial shows how to merge multiple CSV files into one in Java.

Suppose you have n CSV files, and each file may have a different set of headers. This example shows how to merge them and write all the data to a single CSV file using Java.
Let’s say we have two CSV files, csv1.csv and csv2.csv, with the following data.

csv1.csv

The first row of the file contains the header names, and the subsequent rows contain the values.

This file has three header fields, but another CSV file may have more or fewer.

NAME,SURNAME,AGE
Fred,Krueger,Unknown

csv2.csv

As noted above, another CSV file may have more or fewer header fields.

The file below has four header fields, with values in the subsequent rows.

NAME,MIDDLENAME,SURNAME,AGE
Jason,Noname,Scarry,16

So if we merge the above two files, the result should look like this:

single.csv

MIDDLENAME,NAME,AGE,SURNAME
,Fred,Unknown,Krueger
Noname,Jason,16,Scarry

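The merged file single.csv contains each header only once but holds the data from both csv1.csv and csv2.csv. Where a file has no value for a header (MIDDLENAME for csv1.csv), the field is left empty. The column order is not significant here, because the headers are collected in a HashSet, which does not guarantee any iteration order.

The Java code below shows how this is done.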

I am using opencsv-2.2.jar to read from and write to the CSV files. You can download the jar separately, or download the source code from the link at the end of this tutorial.

Define the model for the CSV data. Each CSV file may have a different number of headers, so the model stores each row in a Map of header name to value, which makes the values easy to retrieve later.

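package com.jeejava.csv.model;

import java.util.HashMap;
import java.util.Map;

public class Record {
    // Identifies where the record came from; CSVParser passes the source file name
    private final String id;
    // Header name -> value for one CSV row
    private Map<String, String> values;

    public Record(String id) {
        this.id = id;
        this.values = new HashMap<String, String>();
    }

    public String getId() {
        return id;
    }

    public Map<String, String> getValues() {
        return values;
    }

    public void setValues(Map<String, String> values) {
        this.values = values;
    }

    public void put(String key, String value) {
        values.put(key, value);
    }

    // Returns the value stored for the header, or null if the row has none
    public String get(String key) {
        return values.get(key);
    }
}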
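
To illustrate why a Map suits rows with varying headers, here is a minimal sketch that uses only the JDK. The class RowMapDemo and its method toRow are hypothetical names made up for this illustration and are not part of the tutorial's source code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the tutorial's source code.
class RowMapDemo {

    // Pairs each header with the value at the same position in the row.
    static Map<String, String> toRow(String[] headers, String[] values) {
        Map<String, String> row = new HashMap<String, String>();
        // Stop at the shorter array so a ragged row cannot overrun the headers.
        int n = Math.min(headers.length, values.length);
        for (int i = 0; i < n; i++) {
            row.put(headers[i], values[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, String> fred = toRow(
                new String[] { "NAME", "SURNAME", "AGE" },
                new String[] { "Fred", "Krueger", "Unknown" });
        System.out.println(fred.get("SURNAME"));    // prints Krueger
        System.out.println(fred.get("MIDDLENAME")); // prints null: csv1.csv has no such column
    }
}
```

A lookup for a header the row never had simply returns null, which is what lets the merge step leave that field empty.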

Create CSVParser.java, which does the necessary processing.

Here getCSVHeaders(File file) returns the headers of a single file, and getAllCSVHeaders(List<File> files) merges the headers returned by getCSVHeaders(File file) for every file.

getAllCSVHeaders(List<File> files) returns a Set so that the merged CSV file gets each header only once.
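
As a side note, if you want the merged headers in a predictable order, one option is a LinkedHashSet, which keeps the order in which elements were first added. The sketch below is an assumption-laden variation, not the code used in this tutorial: HeaderUnionDemo and unionHeaders are hypothetical names, and the header lists are hard-coded instead of being read from files.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the tutorial's source code.
class HeaderUnionDemo {

    // Collects every distinct header, keeping first-seen order.
    static Set<String> unionHeaders(List<List<String>> headersPerFile) {
        Set<String> merged = new LinkedHashSet<String>();
        for (List<String> headers : headersPerFile) {
            merged.addAll(headers); // duplicates are ignored by the Set
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> csv1 = Arrays.asList("NAME", "SURNAME", "AGE");
        List<String> csv2 = Arrays.asList("NAME", "MIDDLENAME", "SURNAME", "AGE");
        // prints [NAME, SURNAME, AGE, MIDDLENAME]
        System.out.println(unionHeaders(Arrays.asList(csv1, csv2)));
    }
}
```

With a plain HashSet the same four headers come out, but their order is unspecified.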

getAllRecords(File file, List<String> keys) returns a list of Record objects, one per data row, with each row's values kept in a map keyed by header name.

The method writeToSingleFile(File output, List<Record> records, Set<String> keys) writes the header row and then every record to a single file, leaving a field empty when a record has no value for a header. This is the step that actually merges multiple CSV files into one.
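
The row-writing idea can be sketched without opencsv: for each merged header, look up the record's value and fall back to an empty string. AlignRowDemo and align are hypothetical names for this illustration, and the header order is hard-coded to match the single.csv sample above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of the tutorial's source code.
class AlignRowDemo {

    // Builds one output line whose fields follow the merged header order.
    static String[] align(List<String> mergedHeaders, Map<String, String> row) {
        String[] line = new String[mergedHeaders.size()];
        for (int i = 0; i < line.length; i++) {
            String value = row.get(mergedHeaders.get(i));
            line[i] = (value == null) ? "" : value; // missing header -> empty field
        }
        return line;
    }

    public static void main(String[] args) {
        List<String> headers = Arrays.asList("MIDDLENAME", "NAME", "AGE", "SURNAME");
        Map<String, String> fred = new HashMap<String, String>();
        fred.put("NAME", "Fred");
        fred.put("SURNAME", "Krueger");
        fred.put("AGE", "Unknown");
        // prints ,Fred,Unknown,Krueger
        System.out.println(String.join(",", align(headers, fred)));
    }
}
```

Looping over the merged headers, instead of reading a fixed number of columns, keeps the writer correct no matter how many distinct headers the input files contain.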

package com.jeejava.csv.process;

import com.jeejava.csv.model.Record;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import au.com.bytecode.opencsv.CSVReader;
import au.com.bytecode.opencsv.CSVWriter;

public class CSVParser {

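    // Collects the distinct header names across all the given files
    private Set<String> getAllCSVHeaders(List<File> files) {
        Set<String> headers = new HashSet<String>();
        if (files != null) {
            for (File file : files) {
                List<String> headerList = getCSVHeaders(file);
                if (headerList != null) {
                    headers.addAll(headerList);
                }
            }
        }
        // Empty set (never null) when there are no files or no headers
        return headers;
    }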

    // Reads only the first line of the file, which holds the header names
    private List<String> getCSVHeaders(File file) {
        List<String> headerList = new ArrayList<String>();
        try {
            CSVReader csvReader = new CSVReader(new FileReader(file));

            // First line in the file is header
            String[] headers = csvReader.readNext();
            csvReader.close();
            if (headers != null) {
                for (int i = 0; i < headers.length; i++) {
                    headerList.add(headers[i]);
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Empty list (never null) when the file is missing, unreadable or empty
        return headerList;
    }

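    // Reads every data row of the file into a Record keyed by the file's own headers
    private List<Record> getAllRecords(File file, List<String> keys) {
        List<Record> records = new ArrayList<Record>();
        if (keys == null || keys.isEmpty()) {
            return records;
        }
        try {
            // The last argument skips one line (the header row);
            // the single quote is used as the quote character
            CSVReader csvReader = new CSVReader(new FileReader(file),
                    CSVWriter.DEFAULT_SEPARATOR, '\'', 1);
            List<String[]> lines = csvReader.readAll();
            csvReader.close();
            for (String[] values : lines) {
                // One Record per row; reusing a single instance would make
                // every list entry refer to the same, last-read row
                Record record = new Record(file.getName());
                // Ignore extra values that have no matching header
                for (int i = 0; i < values.length && i < keys.size(); i++) {
                    record.put(keys.get(i), values[i]);
                }
                records.add(record);
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return records;
    }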
    
    // Writes the merged header row, then one line per record
    private void writeToSingleFile(File output, List<Record> records,
            Set<String> keys) {
        try {
            CSVWriter csvWriter = new CSVWriter(new FileWriter(output),
                    CSVWriter.DEFAULT_SEPARATOR, CSVWriter.NO_QUOTE_CHARACTER);
            String[] headers = keys.toArray(new String[0]);
            csvWriter.writeNext(headers);
            for (Record record : records) {
                // Look up every merged header, however many there are;
                // a header missing from this record yields an empty field
                String[] line = new String[headers.length];
                for (int i = 0; i < headers.length; i++) {
                    String value = record.getValues().get(headers[i]);
                    line[i] = (value == null) ? "" : value;
                }
                csvWriter.writeNext(line);
            }
            csvWriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    
    public static void main(String[] args) {
        CSVParser csvParser = new CSVParser();
        String path = System.getProperty("user.dir");
        File f = new File(path + "/src"); // "src" folder under the working directory
        List<File> files = new ArrayList<File>();
        File[] fs = f.listFiles();
        if (fs == null) {
            System.err.println("Cannot list files in " + f);
            return;
        }
        for (File file : fs) {
            // Pick up only CSV files, and never the output of a previous run
            if (file.isFile()
                    && file.getName().toLowerCase().endsWith(".csv")
                    && !file.getName().equalsIgnoreCase("single.csv")) {
                files.add(file);
            }
        }

        List<Record> records = new ArrayList<Record>();
        for (File file : files) {
            List<String> keys = csvParser.getCSVHeaders(file);
            List<Record> fileRecords = csvParser.getAllRecords(file, keys);
            if (fileRecords != null) {
                records.addAll(fileRecords);
            }
        }

        Set<String> keys = csvParser.getAllCSVHeaders(files);

        File output = new File(path + "/src/single.csv");

        csvParser.writeToSingleFile(output, records, keys);
    }
}
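
For comparison, the whole merge can also be sketched with only the JDK. This is a rough illustration under stated assumptions, not a replacement for the opencsv code above: PlainMergeDemo and mergeLines are hypothetical names, lines are split naively on commas (so quoted fields containing commas are not handled), and a LinkedHashSet is used so the column order is predictable.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of the tutorial's source code.
class PlainMergeDemo {

    // Each inner list holds the lines of one CSV file; line 0 is the header row.
    static List<String> mergeLines(List<List<String>> fileContents) {
        Set<String> headers = new LinkedHashSet<String>();
        List<Map<String, String>> rows = new ArrayList<Map<String, String>>();
        for (List<String> lines : fileContents) {
            if (lines.isEmpty()) {
                continue;
            }
            String[] fileHeaders = lines.get(0).split(",", -1); // naive split, no quoting
            headers.addAll(Arrays.asList(fileHeaders));
            for (String line : lines.subList(1, lines.size())) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] values = line.split(",", -1);
                Map<String, String> row = new HashMap<String, String>();
                int n = Math.min(fileHeaders.length, values.length);
                for (int i = 0; i < n; i++) {
                    row.put(fileHeaders[i], values[i]);
                }
                rows.add(row);
            }
        }
        List<String> out = new ArrayList<String>();
        out.add(String.join(",", headers));
        for (Map<String, String> row : rows) {
            List<String> cells = new ArrayList<String>();
            for (String header : headers) {
                String value = row.get(header);
                cells.add(value == null ? "" : value); // missing header -> empty field
            }
            out.add(String.join(",", cells));
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Write the two sample files to a temporary folder, then merge them.
        Path dir = Files.createTempDirectory("csv-merge");
        Path csv1 = dir.resolve("csv1.csv");
        Path csv2 = dir.resolve("csv2.csv");
        Files.write(csv1, Arrays.asList("NAME,SURNAME,AGE", "Fred,Krueger,Unknown"),
                StandardCharsets.UTF_8);
        Files.write(csv2, Arrays.asList("NAME,MIDDLENAME,SURNAME,AGE", "Jason,Noname,Scarry,16"),
                StandardCharsets.UTF_8);

        List<List<String>> contents = new ArrayList<List<String>>();
        for (Path input : Arrays.asList(csv1, csv2)) {
            contents.add(Files.readAllLines(input, StandardCharsets.UTF_8));
        }
        List<String> merged = mergeLines(contents);
        Files.write(dir.resolve("single.csv"), merged, StandardCharsets.UTF_8);
        for (String line : merged) {
            System.out.println(line);
        }
    }
}
```

Run against the two sample files, this prints NAME,SURNAME,AGE,MIDDLENAME followed by Fred,Krueger,Unknown, and Jason,Scarry,16,Noname. The columns appear in first-seen order here, so they differ from the single.csv sample shown earlier.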

That’s all. I hope this gave you an idea of how to merge multiple CSV files into one in Java. The same approach works when you have more than two CSV files.

Thank you for reading.

I am a professional Web developer, Enterprise Application developer, Software Engineer and Blogger. Connect with me on Roy Tutorials.