Filling the database with the test data

You have database schema. You have generated Java code (by OneWebSQL, of course). You have started developing an application. You're missing something -- the data in the database. Your database is empty. There are only tables, no rows.

SQL scripts

There a few approaches to fill the database with data, which is often referred to as initial data or test data. The first approach is obvious -- an SQL script that contains INSERT queries, such as this one:

insert into category (category_id, code, name) values (1, 'programming', 'Programming');
insert into category (category_id, code, name) values (2, 'renaissance', 'Renaissance Literature');

insert into book (book_id, isbn, edition_id, title, publication_year, number_of_pages, category_id) values (100, '978-99999000100', 1, 'Clean Code - A handbook of Agile Software Craftsmanship', 2005, 226, 1);
insert into book (book_id, isbn, edition_id, title, publication_year, number_of_pages, category_id) values (101, '978-99999000101', 2, 'Coders at Work', 2003, 126, 1);
Maintaining SQL scripts has many disadvantages. You have to know SQL. This not as common as you may think. You have to keep track of relations: primary key to foreign key equality. The data is somehow hidden in SQL queries, and it's easy to get lost.

 

CSV files

There is a second approach. Let's generate SQL queries based on pure data stored in a well-defined format. You may use XML, but a CSV file is the better choice. Let's look at an example:

category_id;code;name
1;"programming"; "Programming"
2;"renaissance"; "Renaissance Literature"
...

book_id; isbn; edition_id;title; publication_year; number_of_pages;category_id
100;'978-99999000100';1;'Clean Code - A handbook of Agile Software Craftsmanship';2005;226;1
101;'978-99999000101';2;'Coders at Work';2003;126;1
...
CVS files look clean; they can be edited by anyone and can be edited in a spreadsheet. However, if you have many rows in tables, then you have to copy and paste frequently. Keeping PK/FK integrity is a nightmare, just as in the previous approach.

 

Java code

Let's see how get a computer to do the job. Since you've got generated code, you can use it to fill the database. Here you have an example showing how to populate the category table:

Category createCategory(String name) {
        Category c = new Category();
        c.setName(name);
        c.setCode(name.toLowerCase().replace(" ", "_"));
        return c;
}
    
void generateCategories() {
        List<Category> categories = new ArrayList<Category>();
        
        categories.add(createCategory("Computer programming"));
        categories.add(createCategory("Programming languages"));
        categories.add(createCategory("System development"));
        // ...
        
        CategoryDAO categoryDAO = new CategoryDAOImpl(dataSource, dbAdapter, daoMonitor);
        categoryDAO.insert(categories);
}
If you want to have a large amount of categories, and you don't really need to make them look realistic, use a programming language as it is designed for :)
String[] word1 = new String[]{"", "advanced "};
String[] word2 = new String[]{"computer ", "system ", "human "};
String[] word3 = new String[]{"languages", "development", "programming ", "behaviour "};
      

List<Category> categories = new ArrayList<Category>();
        
for(int i = 0; i < word1.length * word2.length * word3.length; i++) {
   int i3 = i % word3.length;
   int i2 = (i / word3.length) % word2.length;
   int i1 = (i / (word2.length * word3.length)) % word1.length;
            
   String name = word1[i1] + word2[i2] + word3[i3];
   categories.add(category(name));
}
     
    
CategoryDAO categoryDAO = new CategoryDAOImpl(dataSource, dbAdapter, daoMonitor);
categoryDAO.insert(categories);
Now you have category filled in. In the same way you can fill book table, using categories you've already generated. This time, the example is shortened:
for(int i = 0; i < ...; i++) {
   Book b = new Book();
   b.setCategoryId(categories.get(i % categories.size()).getId());
   ...
}

 

Generating data is very useful when in comes to preparing the database for performance testing.

What is your approach?