Streamlining JSON transformations using Jolt

05 / May / 2025 by Joice V Joseph

Introduction

Jolt is a JSON-to-JSON transformation library written in Java. You write a specification that chains a set of operations together to transform your input JSON into the desired output format. The idea is to transform the structure of your JSON data, not to modify its values.
“Use Jolt to get most of the structure right, then write code to fix values.”

The problem Jolt solves

Suppose you are ingesting data from a source, a third-party API for example, and you need to transform it into a format that suits your needs and fits the system you have in place. Usually you'd write code with a common interface for the different sources, and for each source you'd create a bunch of getters and setters plus a control structure full of if-else checks that decides which code to call based on various criteria. This gets messy and cumbersome when new sources are introduced: each one needs new code, a deployment, and possibly a number of additional changes.

The alternative to this whole process could be Jolt. You create a JSON specification for each source, store the specifications in a database, and, based on some differentiating factor in the request, use the same generic logic to apply the transformation specific to that source. Instead of writing imperative transforms, you describe them declaratively.
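
To make the "one generic code path, one spec per source" idea concrete, here is a minimal sketch in plain Java. It is not Jolt code: the in-memory map stands in for the database, and the names SourceSpecRegistry and TransformEngine are hypothetical. TransformEngine is only a seam where a real engine such as Jolt would be plugged in.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: looks up the spec registered for a source and hands it
// to a pluggable engine. The map is a stand-in for a database collection.
final class SourceSpecRegistry {

    /** Hypothetical seam: applies a stored spec (JSON text) to an input document. */
    interface TransformEngine {
        Map<String, Object> apply(String specJson, Map<String, Object> input);
    }

    private final Map<String, String> specsBySource = new ConcurrentHashMap<>();
    private final TransformEngine engine;

    SourceSpecRegistry(TransformEngine engine) {
        this.engine = engine;
    }

    /** Stores (or replaces) the spec for one source. */
    void register(String sourceId, String specJson) {
        specsBySource.put(sourceId, specJson);
    }

    /** Same code path for every source; only the looked-up spec differs. */
    Map<String, Object> transform(String sourceId, Map<String, Object> input) {
        String specJson = specsBySource.get(sourceId);
        if (specJson == null) {
            throw new IllegalArgumentException("No spec registered for source: " + sourceId);
        }
        return engine.apply(specJson, input);
    }
}
```

With this shape, onboarding a new source means registering one more spec rather than deploying new code, which is the trade the paragraph above describes.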

A few use cases where Jolt can excel:
1. Normalizing data into a consistent format, especially when sources provide data in varying JSON formats.
2. Migrating from an old structure to a new one, or vice versa. This can help keep your data backward compatible and handle schema evolution over time.
3. Supporting ETL processes, where it can handle the transform step.

Challenges and considerations:
1. Complex transformations can be hard to express clearly in a Jolt spec, and such specs can be hard to read and maintain.
2. Jolt has a somewhat steep learning curve; the syntax and operations take time to get used to.
3. If your use case requires data validation, Jolt might not be the tool you're looking for. Its primary focus is transforming the structure of the data.

Getting to know Jolt

Refer to this documentation to understand the syntax and available operations in Jolt: Documentation Link

A simple example of how Jolt works:

INPUT JSON:

{
  "rating": {
    "primary": {
      "value": 3,
      "max": 2
    },
    "quality": {
      "value": 3
    }
  }
}

OUTPUT JSON:

{
  "Rating": 3,
  "RatingRange": 2,
  "SecondaryRatings": {
    "quality": {
      "Id": "quality",
      "Value": 3,
      "Range": 5
    }
  },
  "Range": 5
}

TRANSFORMATION SPEC:

[
  {
    "operation": "shift",
    "spec": {
      "rating": {
        "primary": {
          // simple match. Put the value '3' in the output under the "Rating" field
          "value": "Rating",
          "max": "RatingRange"
        },
        // match any children of "rating"
        // Shiftr has a precedence order when matching, so the "*" will match "last".
        // In this case anything that isn't "primary".
        "*": {
          // &1 means, go up one level and grab that value and substitute it in
          // in this example &1 = "quality"
          "max": "SecondaryRatings.&1.Range",
          "value": "SecondaryRatings.&1.Value",
          // We want "quality" to be a value field in the output under
          // "SecondaryRatings.quality.Id", but "quality" is an input key not an input value.
          // The "$" operator means use the input key, instead of the input value as output
          "$": "SecondaryRatings.&1.Id"
        }
      }
    }
  },
  {
    "operation": "default",
    "spec": {
      "Range": 5,
      "SecondaryRatings": {
        "*": {
          // Default all "SecondaryRatings" to have a Range of 5
          "Range": 5
        }
      }
    }
  }
]
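
For comparison, here is roughly what the same reshaping looks like when written by hand in plain Java. This is only an illustrative approximation for the sample input above, not how Jolt works internally; the class name RatingReshaper is hypothetical, and the sketch assumes every child of "rating" is a JSON object.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hand-written approximation of the shift + default spec for the sample input.
final class RatingReshaper {

    @SuppressWarnings("unchecked")
    static Map<String, Object> reshape(Map<String, Object> input) {
        Map<String, Object> out = new LinkedHashMap<>();
        Map<String, Object> secondary = new LinkedHashMap<>();
        Map<String, Object> rating =
                (Map<String, Object>) input.getOrDefault("rating", Map.of());

        for (Map.Entry<String, Object> entry : rating.entrySet()) {
            // Assumption: each child of "rating" is an object
            Map<String, Object> node = (Map<String, Object>) entry.getValue();
            if ("primary".equals(entry.getKey())) {
                // "value": "Rating" and "max": "RatingRange"
                copyIfPresent(node, "value", out, "Rating");
                copyIfPresent(node, "max", out, "RatingRange");
            } else {
                // The "*" branch: &1 resolves to the matched key, "$" writes the key itself
                Map<String, Object> target = new LinkedHashMap<>();
                copyIfPresent(node, "max", target, "Range");
                copyIfPresent(node, "value", target, "Value");
                target.put("Id", entry.getKey());
                secondary.put(entry.getKey(), target);
            }
        }
        if (!secondary.isEmpty()) {
            out.put("SecondaryRatings", secondary);
        }

        // The "default" operation: fill Range only where it is still missing
        out.putIfAbsent("Range", 5);
        for (Object value : secondary.values()) {
            ((Map<String, Object>) value).putIfAbsent("Range", 5);
        }
        return out;
    }

    private static void copyIfPresent(Map<String, Object> from, String fromKey,
                                      Map<String, Object> to, String toKey) {
        if (from.containsKey(fromKey)) {
            to.put(toKey, from.get(fromKey));
        }
    }
}
```

Every new source with a different layout would need another method like this one, whereas with Jolt only the spec changes.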

Implementation in Java

This is the approach that you could follow:

  1. Create transformation specs for different input sources. Save them in your database (MongoDB, for example).
  2. In your API, have some way to identify the original source and use that to pick your transformation spec. This way the same generic code at the service level can be applied to multiple input sources. This approach worked for my use case, but you may follow a different approach based on your needs.

Defining the entity that stores the Jolt transformation specifications.

@Document(collection = "jolt_transformation_spec")
@Getter
@Setter
@ToString
@AllArgsConstructor
public class JoltTransformationSpec extends MongoAuditModel {
  ContentType contentType;
  String languageCode;
  Object specs;
}

Defining the API endpoint and the service method that let you save Jolt transformation specifications for different content types.

@PostMapping("/transform-spec")
@ApiOperation(value = "Save transform spec for MOVIE, LIVE_TV, WEB_SERIES")
public ResponseDTO<String> saveSpec(@RequestParam("contentType") String contentType,
                                    @RequestParam("languageCode") String languageCode,
                                    @RequestBody List<Object> inputJsonSpec) {
    String response = contentIngestionService.saveSpec(ContentType.valueOf(contentType), languageCode, inputJsonSpec);
    return new ResponseDTO<>(Boolean.TRUE,"Request processed.", response);
}

@Override
public String saveSpec(ContentType contentType, String languageCode, Object inputJsonSpec) {
    if (Objects.nonNull(joltSpecDao.findByContentTypeAndLanguageCode(contentType, languageCode))) {
        return "Conflicting spec found for the content type and language code.";
    }
    JoltTransformationSpec joltTransformationSpec = new JoltTransformationSpec(contentType, languageCode, inputJsonSpec);
    JoltTransformationSpec savedSpec = joltSpecDao.save(joltTransformationSpec);
    return "Saved Spec Id:" + savedSpec.getId();
}
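
One detail worth noting: the controller passes the raw contentType request parameter to ContentType.valueOf, which throws IllegalArgumentException for unknown or differently cased values. A more forgiving parser could look like the sketch below. The enum constants are taken from the @ApiOperation text above; the ContentTypes helper and its parse method are hypothetical names, not part of the original code.

```java
import java.util.Locale;
import java.util.Optional;

// Constants assumed from the @ApiOperation description (MOVIE, LIVE_TV, WEB_SERIES)
enum ContentType { MOVIE, LIVE_TV, WEB_SERIES }

// Hypothetical helper: parses a request parameter without throwing
final class ContentTypes {

    static Optional<ContentType> parse(String raw) {
        if (raw == null || raw.isBlank()) {
            return Optional.empty();
        }
        try {
            // Normalize case and whitespace before matching the enum constant
            return Optional.of(ContentType.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }
}
```

The caller can then turn an empty result into a clear client error instead of letting the exception surface as a generic failure.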

Defining an API endpoint that lets you upload the JSON file that requires transformation.

@PostMapping("/ingest")
@ApiOperation(value = "Imports a JSON Array file for given content type(MOVIE, LIVE_TV, WEB_SERIES) and language code")
public ResponseDTO<Object> ingestContent(@RequestParam("contentType") String contentType,
                                         @RequestParam("languageCode") String languageCode,
                                         @RequestParam("jsonFile") MultipartFile jsonFile) throws IOException {
    contentIngestionService.ingestContent(ContentType.valueOf(contentType),languageCode,jsonFile);
    return new ResponseDTO<>(true, "Request Processed.", null);
}

Defining the method to process the content and transform the uploaded file.
This method ingests a file containing an array of JSON objects, applies the Jolt transformation to each object, converts the result to the desired entity (Content in this example), and saves the entities to the database.

How the code works:

  • Fetch the Jolt transformation spec based on content type and language code.
  • Parse the uploaded JSON file as a stream to handle large files efficiently.
  • Iterate through each object in the JSON array:
      • Convert it to a generic object.
      • Apply the Jolt transformation to reshape the JSON.
      • Map the transformed data to a Content entity.
      • Add the transformed entity to a list.
  • Save the list of Content entities to the database in bulk.

/**
* Ingests an array of JSON objects by converting them into appropriate entities
* through the usage of a Jolt transformation spec.
*
* @param contentType The type of content provided in the input JSON that needs conversion.
* @param languageCode The language code for which the specs are saved.
* @param jsonFile The file containing the JSON objects to be ingested.
* @throws IOException if an I/O error occurs during file handling.
*/
@Override
public void ingestContent(ContentType contentType, String languageCode, MultipartFile jsonFile) throws IOException {
    log.info("Content ingestion started");

    // Load the stored spec and build the Jolt transformation chain from it
    JoltTransformationSpec transformationSpecDocument = joltSpecDao.findByContentTypeAndLanguageCode(contentType, languageCode);
    if (Objects.isNull(transformationSpecDocument)) {
        throw new IllegalStateException("No transformation spec found for the content type and language code.");
    }
    List<?> transformationSpec = (List<?>) transformationSpecDocument.getSpecs();
    Chainr chainr = Chainr.fromSpec(transformationSpec);
    List<Content> contentToBeSaved = new ArrayList<>();

    // Read the upload through a buffered stream and a streaming parser so the
    // file is consumed one array element at a time; both are closed on exit
    try (InputStream inputStream = new BufferedInputStream(jsonFile.getInputStream());
         JsonParser jsonParser = new JsonFactory().createParser(inputStream)) {

        if (jsonParser.nextToken() != JsonToken.START_ARRAY) {
            throw new IOException("Expected the uploaded file to contain a JSON array.");
        }

        /*
         * Iterate through each individual object in the JSON array,
         * convert it to a generic object, apply the transformations,
         * and add the converted entity to a list
         */
        while (jsonParser.nextToken() != JsonToken.END_ARRAY) {
            byte[] jsonBytes = objectMapper.writeValueAsBytes(objectMapper.readTree(jsonParser));
            InputStream individualInputStream = new ByteArrayInputStream(jsonBytes);
            Object inputObject = JsonUtils.jsonToObject(individualInputStream);
            Object transformedOutput = chainr.transform(inputObject);
            Content transformedContent = objectMapper.convertValue(transformedOutput, Content.class);
            log.info("Successfully transformed Content Id: {}", transformedContent.getId());
            contentToBeSaved.add(transformedContent);
        }
    }

    contentDAO.save(contentToBeSaved);
    log.info("Finished ingesting content (count: {}).", contentToBeSaved.size());
}
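
The method above parses the upload as a stream but still gathers every transformed entity in contentToBeSaved before one bulk save, so memory use grows with the size of the file. If that becomes a problem, one option is to save in fixed-size batches. The sketch below is plain Java and purely illustrative: BatchingSink is a hypothetical helper, and the writer callback stands in for a call such as contentDAO.save.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: buffers items and hands them to a writer in fixed-size batches
final class BatchingSink<T> implements AutoCloseable {

    private final int batchSize;
    private final Consumer<List<T>> writer;
    private final List<T> buffer;

    BatchingSink(int batchSize, Consumer<List<T>> writer) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.writer = writer;
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Buffers one item and flushes as soon as a full batch is available. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Writes whatever is buffered; does nothing when the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        writer.accept(new ArrayList<>(buffer)); // pass a copy so the buffer can be reused
        buffer.clear();
    }

    /** Flushes the final, possibly partial, batch. */
    @Override
    public void close() {
        flush();
    }
}
```

Inside the ingestion loop, sink.add(transformedContent) would replace contentToBeSaved.add(...), with the sink opened in a try-with-resources block. The trade-off is that a failure midway leaves earlier batches already saved, so this only fits when partial ingestion is acceptable or handled separately.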

Conclusion

Jolt excels at converting JSON data from one structure to another declaratively, which cuts down on source-specific custom code and helps keep data processing consistent. It can seem challenging and complex at first, but its flexibility supports a wide range of use cases. If you understand its limitations and where it excels, it can be a valuable tool for transformations.
