Extract data from XML and expose it as a service

Debatosh Tripathy
4 min read · Nov 29, 2020

Simplify XML data processing with JavaScript and expose the data as a service without deploying any code!

Photo by Grant Durr on Unsplash

In this article, we will discuss how to extract data from an XML file and expose it as a service using an IBM Cloudant view.

Prerequisites

  • An active IBM Cloudant account
  • Basic JavaScript knowledge
  • Knowledge of Java or any other language that can work with XML files

Introduction

We have all dealt with XML files and used numerous tools to parse XML and extract information from it. However, the extraction code can get complicated when the XML has a complex structure. In this article, we will discuss how an IBM Cloudant view can simplify XML data extraction and also expose the result as a service.

Let’s demonstrate this through an example!

Note: For this article, we will use Java to convert the XML file below to JSON and then store it in the Cloudant database. You could use any other language of your choice to accomplish the same.

Sample XML File used for this tutorial: Click to download

Overall Process

  • Convert the XML to JSON and store it in the Cloudant database
  • Configure a Cloudant view to extract data and expose it as a service

Objective:

The attached XML file contains country names and their respective codes in multiple languages. We will extract the list of English country names with their codes and expose it as a service using Cloudant views.

Step-1: Convert the XML to JSON and store it in the Cloudant database

Step-i: Create a Java project in any IDE (for example, Eclipse).

Step-ii: Add the json jar (org.json) and the commons-io jar to the Java build path.

Step-iii: Create a class named “CreateJSON” and paste in the code given below.

Note: To keep the tutorial simple, we will use the Java code below to convert the XML to JSON and store it in a file. We will then copy the content of that file to create a database document. This could also be done programmatically.

package com.test;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import org.apache.commons.io.IOUtils;
import org.json.JSONObject;
import org.json.XML;

public class CreateJSON {
    public static void main(String[] args) throws Exception {
        // Read the whole XML file into a string
        String xmlStr;
        try (InputStream in = new FileInputStream("src/Country_List.xml")) {
            xmlStr = IOUtils.toString(in, StandardCharsets.UTF_8);
        }

        // Convert the XML to JSON and set the document ID that the Cloudant view checks for
        JSONObject jsonObject = XML.toJSONObject(xmlStr);
        jsonObject.put("_id", "countrylist_id");

        // Write the JSON to a file so it can be pasted into a new Cloudant document
        try (Writer writer = new OutputStreamWriter(
                new FileOutputStream("src/Country_List.txt"), StandardCharsets.UTF_8)) {
            writer.write(jsonObject.toString());
        }
        System.out.println(">>> Finished");
    }
}
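Before pasting the generated JSON into Cloudant, it can help to confirm that it has the shape the view will expect. The following is a minimal sketch, not part of the original tutorial: asArray and checkCountryDoc are hypothetical helper names, and the field names (picklist, entry, description, name) are taken from the map function used later in this article. One thing worth checking is that org.json's XML.toJSONObject turns a repeated element into an array but a single occurrence into a plain object.

```javascript
// Hypothetical helper: wrap a lone object in an array, pass arrays through.
function asArray(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Hypothetical helper: returns a list of problems found in the converted document.
function checkCountryDoc(doc) {
  var problems = [];
  if (doc._id !== "countrylist_id") {
    problems.push("_id should be \"countrylist_id\"");
  }
  if (!doc.picklist || doc.picklist.entry === undefined) {
    problems.push("picklist.entry is missing");
    return problems;
  }
  if (!Array.isArray(doc.picklist.entry)) {
    problems.push("picklist.entry is not an array (only one entry in the XML?)");
  }
  asArray(doc.picklist.entry).forEach(function (entry, i) {
    if (entry.name === undefined) {
      problems.push("entry " + i + " has no name");
    }
    if (!Array.isArray(entry.description)) {
      problems.push("entry " + i + " description is not an array");
    }
  });
  return problems;
}
```

Calling checkCountryDoc(JSON.parse(fileContents)) on the content of Country_List.txt should return an empty array when the document matches the assumed shape.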

Step-iv: Open your Cloudant account and create a database named sample_db.

Step-v: Open the sample_db database, click "New Doc", paste the content of Country_List.txt into the document, and then click “save” to create the document.
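As noted earlier, the document could also be created programmatically instead of through the dashboard. The sketch below shows one way this might look using the CouchDB-style PUT /{db}/{docid} endpoint that Cloudant exposes. The function names are hypothetical, and baseUrl and authHeader are placeholders: the actual account URL and the authentication scheme depend on how your Cloudant instance is configured.

```javascript
// Build the URL and fetch options for creating a document with a known ID.
// baseUrl and authHeader are placeholders supplied by the caller.
function buildPutDocRequest(baseUrl, dbName, doc, authHeader) {
  var url = baseUrl.replace(/\/+$/, "") + "/" +
    encodeURIComponent(dbName) + "/" + encodeURIComponent(doc._id);
  return {
    url: url,
    options: {
      method: "PUT",
      headers: {
        "Content-Type": "application/json",
        "Authorization": authHeader
      },
      body: JSON.stringify(doc)
    }
  };
}

// Send the request. Needs a runtime with a global fetch (for example Node 18+).
async function uploadDoc(baseUrl, dbName, doc, authHeader) {
  var req = buildPutDocRequest(baseUrl, dbName, doc, authHeader);
  var res = await fetch(req.url, req.options);
  if (!res.ok) {
    throw new Error("Upload failed with HTTP " + res.status);
  }
  return res.json();
}
```

Keeping the request construction separate from the network call makes it possible to inspect the URL and body before anything is sent.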

Step-2: Configure a Cloudant view to extract data and expose it as a service

Step-i: Create a view by clicking the plus symbol next to "All Documents" or "Design Documents" and choosing the "New View" option. Give the index name as "countries-list-us-en" and the _design name as "data".

Step-ii: Add the JavaScript below to the "Map function" text box.

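function (doc) {
  // Only process the country list document
  if (doc._id === "countrylist_id") {
    var entries = doc.picklist.entry;
    for (var i = 0; i < entries.length; i++) {
      var descriptions = entries[i].description;
      for (var j = 0; j < descriptions.length; j++) {
        // Emit the English country name as the key and the country code as the value
        if (descriptions[j].language === "en-US") {
          emit(descriptions[j].content, entries[i].name);
        }
      }
    }
  }
}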
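Because a map function is plain JavaScript, its logic can be tried locally before it is saved in Cloudant. The sketch below runs a copy of the map logic above against a small sample document with a stand-in emit function. The sample data is made up for illustration and only mirrors the fields the map function reads; runMap and sampleCountryDoc are hypothetical names.

```javascript
// Stand-in for Cloudant's emit(): collect the emitted rows in an array.
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Copy of the map logic, assigned to a variable so it can be called directly.
var mapCountries = function (doc) {
  if (doc._id === "countrylist_id") {
    var entries = doc.picklist.entry;
    for (var i = 0; i < entries.length; i++) {
      var descriptions = entries[i].description;
      for (var j = 0; j < descriptions.length; j++) {
        if (descriptions[j].language === "en-US") {
          emit(descriptions[j].content, entries[i].name);
        }
      }
    }
  }
};

// Made-up sample with the assumed document shape (not the real file content).
var sampleCountryDoc = {
  _id: "countrylist_id",
  picklist: {
    entry: [
      { name: "US", description: [
        { language: "en-US", content: "United States" },
        { language: "it-IT", content: "Stati Uniti" }
      ] },
      { name: "IT", description: [
        { language: "en-US", content: "Italy" },
        { language: "it-IT", content: "Italia" }
      ] }
    ]
  }
};

// Run the map function on one document and return what it emitted.
function runMap(doc) {
  emitted = [];
  mapCountries(doc);
  return emitted;
}

console.log(runMap(sampleCountryDoc));
// → [ { key: 'United States', value: 'US' }, { key: 'Italy', value: 'IT' } ]
```

Note that the local run returns rows in emit order, whereas a Cloudant view returns its rows sorted by key.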

Step-iii: Now click the “Create Document and then Build Index” button. This starts the process of creating the view.

Step-iv: Once the view has been created, click the view name and then click the JSON option (highlighted in red), as shown in the image below.

Step-v: This opens a unique URL (as shown below) through which the data can be consumed. Here “key” represents the country name and “value” represents the corresponding two-character code of that country.
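To illustrate how an application might consume the view, here is a minimal sketch. It assumes the standard CouchDB-style view path /{db}/_design/{ddoc}/_view/{view} and the usual response shape with a rows array of key/value pairs. The function names are hypothetical, baseUrl is a placeholder for your account URL, and the optional authHeader reflects the assumption that your database may require authentication.

```javascript
// Build the view URL; pass a key to look up a single country name.
function buildViewUrl(baseUrl, key) {
  var url = baseUrl.replace(/\/+$/, "") +
    "/sample_db/_design/data/_view/countries-list-us-en";
  if (key !== undefined) {
    // View keys are JSON values, so a string key must keep its quotes.
    url += "?key=" + encodeURIComponent(JSON.stringify(key));
  }
  return url;
}

// Turn the view response into a simple { countryName: code } lookup object.
function rowsToLookup(viewResponse) {
  var lookup = {};
  viewResponse.rows.forEach(function (row) {
    lookup[row.key] = row.value;
  });
  return lookup;
}

// Fetch the view with a GET call. Needs a global fetch (for example Node 18+).
async function fetchCountryCodes(baseUrl, authHeader) {
  var headers = authHeader ? { "Authorization": authHeader } : {};
  var res = await fetch(buildViewUrl(baseUrl), { headers: headers });
  if (!res.ok) {
    throw new Error("View request failed with HTTP " + res.status);
  }
  return rowsToLookup(await res.json());
}
```

With a lookup object in hand, a caller could resolve a code with something like lookup["Italy"].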

Note: Similarly, views for the other languages available in the attached XML file can easily be created by making a minor change to the JavaScript code, as shown below.

function (doc) {
  // Only process the country list document
  if (doc._id === "countrylist_id") {
    var entries = doc.picklist.entry;
    for (var i = 0; i < entries.length; i++) {
      var descriptions = entries[i].description;
      for (var j = 0; j < descriptions.length; j++) {
        // Changed language below to show data in Italian
        if (descriptions[j].language === "it-IT") {
          emit(descriptions[j].content, entries[i].name);
        }
      }
    }
  }
}
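Since the per-language map functions differ only in the language code, the views could also be generated rather than copied by hand. Cloudant keeps views in a design document, where each map function is stored as a string. The sketch below builds such a design document for several languages. mapSourceFor and buildDesignDoc are hypothetical helpers, and the view name "countries-list-it-it" is an assumed name chosen for illustration.

```javascript
// Produce the map function source for one language code.
function mapSourceFor(language) {
  return [
    'function (doc) {',
    '  if (doc._id === "countrylist_id") {',
    '    var entries = doc.picklist.entry;',
    '    for (var i = 0; i < entries.length; i++) {',
    '      var descriptions = entries[i].description;',
    '      for (var j = 0; j < descriptions.length; j++) {',
    '        if (descriptions[j].language === ' + JSON.stringify(language) + ') {',
    '          emit(descriptions[j].content, entries[i].name);',
    '        }',
    '      }',
    '    }',
    '  }',
    '}'
  ].join("\n");
}

// Build a design document with one view per { viewName: languageCode } pair.
function buildDesignDoc(viewLanguages) {
  var views = {};
  Object.keys(viewLanguages).forEach(function (viewName) {
    views[viewName] = { map: mapSourceFor(viewLanguages[viewName]) };
  });
  return { _id: "_design/data", views: views };
}

var designDoc = buildDesignDoc({
  "countries-list-us-en": "en-US",
  "countries-list-it-it": "it-IT" // assumed view name
});

console.log(Object.keys(designDoc.views));
```

Saving an object like this as the _design/data document in sample_db should define both views in one step. When updating an existing design document, its current _rev would also need to be included.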

Summary

As you saw in this article, with minimal coding we can extract data from XML and expose it as a service using IBM Cloudant’s built-in view feature. The Cloudant view described above has a unique URL that can be accessed through a GET call, which makes it convenient for adopting applications to consume.
