#Hazelcast: Improve Java application throughput by optimizing object deserialization and distributed queries

Hazelcast is a popular in-memory data grid that is often used alongside databases to improve application performance, to distribute data across servers, clusters and geographies, and to manage very large data sets. Here’s a quick reference for integrating Hazelcast with Grails.

Since Hazelcast is a distributed system, the data we put into it has to be serialized. See http://docs.hazelcast.org/docs/latest/manual/html/serialization.html for more information about object serialization and deserialization. Deserializing objects is expensive, so you want to avoid it or optimize it as much as possible when working with Hazelcast.
For example:

(1) Use IMap.delete() instead of IMap.remove(), and IMap.set() instead of IMap.put(). These alternative methods don’t return the previous value and therefore cause less deserialization.
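To see why the return value matters, here is a toy model written only with the Java standard library. It is not Hazelcast code and does not reflect Hazelcast internals: values are simply held as serialized bytes, a put-style call has to deserialize the previous value in order to hand it back, and a set-style call does not. The class name `ToyBinaryStore` and its counter are hypothetical, chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT Hazelcast code. Values are kept as serialized bytes.
final class ToyBinaryStore {
    private final Map<String, byte[]> data = new HashMap<>();
    private int deserializations = 0;

    // put-style: returns the previous value, so the old bytes must be deserialized.
    Object put(String key, Object value) {
        byte[] old = data.put(key, serialize(value));
        return old == null ? null : deserialize(old);
    }

    // set-style: nothing is returned, so the old bytes are simply dropped.
    void set(String key, Object value) {
        data.put(key, serialize(value));
    }

    int deserializationCount() {
        return deserializations;
    }

    private static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    private Object deserialize(byte[] bytes) {
        deserializations++;
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ToyBinaryStore store = new ToyBinaryStore();
        store.set("k", "v1");
        store.set("k", "v2");   // overwrites without reading the old value
        store.put("k", "v3");   // overwrites and hands back the old value
        System.out.println("deserializations: " + store.deserializationCount());
    }
}
```

In this model only the `put` call touches the deserializer; the same reasoning applies to delete versus remove.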

(2) Choose an efficient serialization method.
>> When Hazelcast serializes an object into Data:

It first checks whether the object is an instance of com.hazelcast.nio.serialization.DataSerializable.
If that fails, it checks whether the object is an instance of com.hazelcast.nio.serialization.Portable.
If that fails, it checks whether the object is of a well-known type such as String, Long or Integer, or of a type covered by a user-specified serializer such as ByteArraySerializer or StreamSerializer.
If all of these checks fail, Hazelcast falls back to Java serialization.

>> Example of a class implementing com.hazelcast.nio.serialization.DataSerializable:

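[sourcecode language="java"]import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.DataSerializable;
import java.io.IOException;

public class Address implements DataSerializable {
    private int zipCode;
    private String city;

    // A public no-arg constructor is needed so the object can be re-created on deserialization
    public Address() {}

    // getters and setters...

    // Write the fields in a fixed order
    public void writeData(ObjectDataOutput out) throws IOException {
        out.writeInt(zipCode);
        out.writeUTF(city);
    }

    // Read the fields back in the same order they were written
    public void readData(ObjectDataInput in) throws IOException {
        zipCode = in.readInt();
        city = in.readUTF();
    }
}[/sourcecode]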
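As a rough illustration of why field-by-field serialization tends to be cheaper than default Java serialization, the sketch below uses only the Java standard library. It writes the same two fields with `DataOutputStream` (which offers the same `writeInt`/`writeUTF` calls used above) and with `ObjectOutputStream`, then compares payload sizes. `PlainAddress` and `SerializationSizeDemo` are hypothetical names, nothing here calls Hazelcast, and the byte counts Hazelcast itself produces will differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the Address class, using only JDK types.
final class PlainAddress implements Serializable {
    private static final long serialVersionUID = 1L;
    final int zipCode;
    final String city;

    PlainAddress(int zipCode, String city) {
        this.zipCode = zipCode;
        this.city = city;
    }
}

final class SerializationSizeDemo {
    // Field-by-field write: only the raw field data ends up in the payload.
    static byte[] writeFields(PlainAddress address) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(address.zipCode);
            out.writeUTF(address.city);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Field-by-field read, in the same order as the write.
    static PlainAddress readFields(byte[] payload) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            int zipCode = in.readInt();
            String city = in.readUTF();
            return new PlainAddress(zipCode, city);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Default Java serialization: the payload also carries class metadata.
    static byte[] javaSerialize(PlainAddress address) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(address);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        PlainAddress address = new PlainAddress(700001, "Kolkata");
        System.out.println("field-by-field bytes:     " + writeFields(address).length);
        System.out.println("Java serialization bytes: " + javaSerialize(address).length);
    }
}
```

The field-by-field payload contains only the int and the length-prefixed string, while the Java-serialized form also has to describe the class and its fields, which is the overhead the fallback path pays.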

(3) Use a distributed query instead of fetching the entire entrySet and iterating over all keys or all values of an IMap locally.

Hazelcast has supported distributed queries since version 3.0 (compiled with Oracle JDK 6).
See http://hazelcast.org/features/#query for more information.

Hazelcast offers two APIs for distributed queries:
i) Criteria API (e.g. com.hazelcast.query.Predicates, com.hazelcast.query.PredicateBuilder)
ii) Distributed SQL query (e.g. com.hazelcast.query.SqlPredicate)

>> How a distributed query works

The requested predicate is sent to each member in the cluster.
Each member looks at its own local entries and filters them according to the predicate. At this stage, the key/value pairs of the entries are deserialized and then passed to the predicate.
The member that issued the query then merges the results coming from each member into a single set.
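The three steps above can be sketched with plain Java collections. This is a single-JVM simulation and not Hazelcast's implementation: each `Map` in the list stands in for one member's local entries, and `ScatterGatherQuery` and `query` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Single-JVM simulation only: NOT how Hazelcast is implemented.
final class ScatterGatherQuery {

    static <K, V> List<V> query(List<Map<K, V>> members, Predicate<? super V> predicate) {
        List<V> merged = new ArrayList<>();
        // Step 1: the same predicate is handed to every "member".
        for (Map<K, V> localEntries : members) {
            // Step 2: each member filters only its own local entries.
            List<V> partial = new ArrayList<>();
            for (V value : localEntries.values()) {
                if (predicate.test(value)) {
                    partial.add(value);
                }
            }
            // Step 3: the caller merges the partial results.
            merged.addAll(partial);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<Integer, String> member1 = new HashMap<>();
        member1.put(1, "Kolkata");
        member1.put(2, "Delhi");
        Map<Integer, String> member2 = new HashMap<>();
        member2.put(3, "Mumbai");
        member2.put(4, "Kolkata");

        List<Map<Integer, String>> cluster = new ArrayList<>();
        cluster.add(member1);
        cluster.add(member2);

        System.out.println(query(cluster, city -> city.equals("Kolkata")));
    }
}
```

The point of the pattern is that only matching entries travel back to the caller, rather than the whole data set being pulled to one node and filtered there.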

>> Below is an example of a distributed query using the Predicates class:
Q. Find all addresses in the city of Kolkata except those with zip code 700050, 700052 or 700001, plus all addresses in Delhi and Mumbai. (Zip codes are not unique across countries, which is why the city is part of the condition.)
Sol.

[sourcecode language="java"]// Addresses in Kolkata...
Predicate allKolAddrs = Predicates.equal("city", "Kolkata");
// ...excluding the three listed zip codes
Predicate reqZipCodes = Predicates.not(Predicates.in("zipCode", 700050, 700052, 700001));
// Addresses in Delhi or Mumbai
Predicate allDelhiMumbaiAddrs = Predicates.in("city", "Delhi", "Mumbai");
// (Kolkata AND NOT excluded zip codes) OR (Delhi, Mumbai)
Predicate reqPredicate = Predicates.or(Predicates.and(allKolAddrs, reqZipCodes), allDelhiMumbaiAddrs);
addressMap.values(reqPredicate); // desired result[/sourcecode]

Predicates can also be applied to the keySet, entrySet and localKeySet of a Hazelcast distributed map.
*Note: for the query to execute, each node/member must have the queried class and all of its dependencies on its classpath (here Address.class).

(4) Use indexes

You can speed up these queries with indexes, but only if you run more queries than put/get operations. Use the sample below in hazelcast.xml.

<map name="address">
    …
    <indexes>
        <index ordered="true">zipCode</index>
    </indexes>
</map>
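To give a feel for what an ordered index buys you, here is a toy sketch built on `java.util.TreeMap`. It is not how Hazelcast implements indexes, and `ZipCodeIndex` is a hypothetical name. The index maps each zip code to the keys of the entries that carry it, so a range lookup reads only the matching slice, at the price of extra work on every write.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

// Toy ordered index: NOT Hazelcast's implementation.
final class ZipCodeIndex {
    // Sorted by zip code, so range lookups do not need to scan every entry.
    private final NavigableMap<Integer, Set<String>> keysByZip = new TreeMap<>();

    // Every write also has to update the index; this is the cost side of the trade-off.
    void add(String entryKey, int zipCode) {
        keysByZip.computeIfAbsent(zipCode, zip -> new HashSet<>()).add(entryKey);
    }

    // Returns the keys of all entries whose zip code lies in [from, to], both inclusive.
    List<String> keysBetween(int from, int to) {
        List<String> result = new ArrayList<>();
        for (Set<String> keys : keysByZip.subMap(from, true, to, true).values()) {
            result.addAll(keys);
        }
        return result;
    }

    public static void main(String[] args) {
        ZipCodeIndex index = new ZipCodeIndex();
        index.add("addr-1", 700001);
        index.add("addr-2", 700050);
        index.add("addr-3", 110001);
        System.out.println(index.keysBetween(700000, 700099));
    }
}
```

An unordered index would only help equality lookups; the sorted structure is what makes range-style predicates cheap in this sketch.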

(5) Specify the in-memory format

By default, Hazelcast stores data in memory in binary (serialized) format. Sometimes it is more efficient to store the entries in their object form, especially for local processing such as entry processors and queries. Use the sample below in hazelcast.xml.

<map name="address">
    ...
    <in-memory-format>OBJECT</in-memory-format> <!-- the default is BINARY -->
    ...
</map>

In the above case, the data will be stored in deserialized form.

Hope this helps when working with Hazelcast.

For further reference: http://hazelcast.org/mastering-hazelcast/
