Batch Processing In Grails

09 / Sep / 2009 by Imran Mir 6 comments

In one of my project assignments I needed to insert a large number of records into the database. I had to read the objects from an external source. Once I had read all of the objects into a List, I iterated over the list and saved each one individually. In the beginning the process ran fine, but as time passed the execution slowed down considerably, until it took almost one second to insert a single object. Imagine how long it would take to insert 50,000 records at that pace. Besides, it frequently failed with an OutOfMemoryError. The code I had written did something like this:

        (0..60000).each {
            Person person = new Person(.....)
            person.save()
        }

One of the solutions I found was to use transactions and save the objects in batches, with each transaction saving one batch of objects. It worked well and reduced the execution time considerably. What I did was something like this:

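        def startTime = System.nanoTime()
        List batch = []
        (0..50000).each {
            Person person = new Person(....)
            batch.add(person)
            println "Created:::::" + it
            if (batch.size() >= 1000) {
                // save the whole batch inside a single transaction
                Person.withTransaction {
                    for (Person p in batch) {
                        p.save()
                    }
                }
                // empty the batch only after it has been saved
                batch.clear()
            }
        }
        // save whatever is left over from the last, partial batch
        if (batch) {
            Person.withTransaction {
                for (Person p in batch) {
                    p.save()
                }
            }
        }
        def endTime = System.nanoTime()
        def diff = (endTime - startTime) / 1000000000   // elapsed time in seconds
        println "TIME TAKEN IS :::" + diff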

In the previous case the time taken to save 50,000 records was around 500 seconds. Here, the time taken to save the same number of records came out to be just 80 seconds.
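The batching logic itself does not depend on GORM, so it can be tried out in plain Groovy first. The following is only a minimal sketch: the helper name eachBatch is hypothetical (it is not part of Grails or Groovy), and the handler here just records batch sizes where a real one would save the objects inside a transaction.

```groovy
// Hypothetical helper, not part of Grails: hands items to a handler in fixed-size batches.
def eachBatch(Iterable items, int batchSize, Closure handler) {
    def batch = []
    int batches = 0
    for (item in items) {
        batch << item
        if (batch.size() == batchSize) {
            handler(batch)      // e.g. save every element inside one transaction
            batch = []          // start a fresh list for the next batch
            batches++
        }
    }
    if (batch) {                // the last batch is usually smaller than batchSize
        handler(batch)
        batches++
    }
    return batches
}

// Stand-in handler that only records batch sizes; a real one would call save()
def sizes = []
def count = eachBatch(1..2500, 1000) { chunk -> sizes << chunk.size() }
println "batches=${count}, sizes=${sizes}"   // prints batches=3, sizes=[1000, 1000, 500]
```

Because the helper accepts any Iterable, the source does not have to be read into one big List before saving starts, and the trailing partial batch is handled in one place.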

But there is one flaw in this method: if the objects are bulky, even this approach does not work. Each action in a Grails controller is executed within a Hibernate session. The session is opened right before the action starts and is closed once it returns. Hibernate therefore caches all the newly inserted Person instances in the session-level cache. As the number of objects grows, the session becomes bulkier, which slows down the whole process. That also explains the memory issue, the OutOfMemoryError: every object is being cached in the Hibernate session.

The solution to this problem is to clear the session regularly so that it stays light throughout the process. All that needs to be done is to get hold of the current session and clear it after each batch has been written to the database. To do this, inject the sessionFactory bean into your controller (declare a property with "def sessionFactory"), get the current session from it, and then clear that session.

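        def startTime = System.nanoTime()
        List batch = []
        (0..50000).each {
            Person person = new Person(....)
            batch.add(person)
            println "Created:::::" + it
            if (batch.size() >= 1000) {
                Person.withTransaction {
                    for (Person p in batch) {
                        p.save()
                    }
                }
                batch.clear()
                // clear the session once per batch, not once per object
                def session = sessionFactory.getCurrentSession()
                session.flush()   // write any pending changes before evicting
                session.clear()
            }
        }
        // save whatever is left over from the last, partial batch
        if (batch) {
            Person.withTransaction {
                batch.each { it.save() }
            }
        }
        def endTime = System.nanoTime()
        def diff = (endTime - startTime) / 1000000000   // elapsed time in seconds
        println "TIME TAKEN IS :::" + diff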
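The same idea can be moved out of the controller into a service, which keeps the injected sessionFactory in one place. This is a sketch rather than a definitive implementation: the class name PersonImportService, the method importPeople and the shape of the input (one Map of properties per Person) are assumptions made for illustration, and it relies on a Groovy version that provides List.collate. It only runs inside a Grails application where Person is a domain class.

```groovy
// Hypothetical Grails service; names and input shape are illustrative assumptions.
class PersonImportService {

    // Grails injects the Hibernate SessionFactory bean by property name
    def sessionFactory

    // rows: one Map of Person properties per record (assumed input shape)
    void importPeople(List<Map> rows, int batchSize = 1000) {
        // collate splits the list into chunks of batchSize, including a final smaller one
        rows.collate(batchSize).each { chunk ->
            // one transaction per batch
            Person.withTransaction {
                chunk.each { row -> new Person(row).save() }
            }
            // write anything still pending, then evict the batch from the session cache
            def session = sessionFactory.currentSession
            session.flush()
            session.clear()
        }
    }
}
```

A controller action would then declare "def personImportService" and call personImportService.importPeople(rows). Since collate already yields the last, smaller chunk, no separate leftover step is needed.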

Thank you,
Imran Mir,
imran@intelligrape.com

comments (6)

  1. Vivek Krishna

    Hi Srinath,

    You need to inject sessionFactory bean into the artefact using “def sessionFactory”

    — Vivek

  2. srinath

    Hi,
    How do I get sessionFactory?

    I was getting the issue below:
    groovy.lang.MissingPropertyException: No such property: sessionFactory for class:

    Do I need to import any JARs?

    thanks.

  3. Lee Butts

    Hi,

    your code above is clearing the Hibernate session after every Person is created, shouldn’t it be inside the if(batch.size()>1000)… block?

    cheers

    Lee
