Technical Lab: Schema Mapping, Transformation, Data Formats, Error Handling & RPA

This lab session brings together the practical technical skills required for the course exercises. It covers six interconnected topics: schema mapping between systems, data transformation using XSLT, XML Schema validation in Java, a comparison of data exchange formats, integration error handling basics, and Robotic Process Automation as a last-resort integration technique.

Part 1: Schema Mapping

What Is Schema Mapping?

A schema defines the structure and constraints of a data format — what fields exist, what types they are, which are required, and what values are valid.

Schema mapping is the process of defining how fields in one system's schema correspond to fields in another system's schema. It answers: "This field from System A should go into which field in System B, and does it need to be transformed?"

Why It's Necessary

Almost no two systems use the same data model, even for the same concept:

| System A (ERP) | System B (CRM) | Mapping |
|----------------|----------------|---------|
| customerNumber | accountId | Direct rename |
| firstName + lastName | fullName | Concatenate |
| dateOfBirth (DD/MM/YYYY) | birthDate (YYYY-MM-DD) | Format conversion |
| countryCode (ISO 2) | countryName (full name) | Lookup / expand |
| creditLimit (integer pence) | creditLimitDecimal (decimal pounds) | Divide by 100 |
| (none) | createdAt | Default to current timestamp |

A complete schema mapping document captures every field, the transformation required, and what to do when the source value is null or invalid.

Schema Mapping Template

Source System:    SAP ERP
Target System:    Salesforce CRM
Entity:           Customer

Source Field        | Target Field         | Transformation        | Null handling
--------------------|----------------------|-----------------------|------------------
customerNumber      | Account.ExternalId   | Direct                | Reject (required)
firstName           | Account.FirstName    | Direct                | Default to ""
lastName            | Account.LastName     | Direct                | Reject (required)
firstName+lastName  | Account.Name         | Concatenate with " "  | Use lastName only
countryCode (ISO2)  | Account.Country      | Lookup → full name    | Default to "Unknown"
creditLimit (pence) | Account.CreditLimit  | Divide by 100         | Default to 0
-                   | Account.CreatedDate  | CurrentDateTime()     | N/A
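
The template can also be read as a small mapping function. The Java sketch below is one possible reading of the rows above, not generated MapForce output. The record types `ErpCustomer` and `CrmAccount`, the class `CustomerMapper`, and the three-entry country table are hypothetical names chosen for illustration (records require Java 16 or later).

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.Map;

// Hypothetical source and target shapes; field names follow the template rows.
record ErpCustomer(String customerNumber, String firstName, String lastName,
                   String countryCode, Long creditLimitPence) {}

record CrmAccount(String externalId, String firstName, String lastName, String name,
                  String country, BigDecimal creditLimit, Instant createdDate) {}

final class CustomerMapper {

    // Illustrative lookup table only; a real mapping would cover all ISO 3166-1 codes.
    private static final Map<String, String> COUNTRY_NAMES =
        Map.of("FI", "Finland", "DE", "Germany", "GB", "United Kingdom");

    static CrmAccount map(ErpCustomer src, Instant now) {
        // "Reject (required)": fail fast instead of inventing a value.
        if (src.customerNumber() == null || src.lastName() == null) {
            throw new IllegalArgumentException("customerNumber and lastName are required");
        }
        // Missing first name defaults to the empty string.
        String first = src.firstName() == null ? "" : src.firstName();
        // Concatenate with a space, or use lastName only when there is no first name.
        String name = first.isEmpty() ? src.lastName() : first + " " + src.lastName();
        // Lookup to full name, defaulting to "Unknown".
        String country = src.countryCode() == null
            ? "Unknown"
            : COUNTRY_NAMES.getOrDefault(src.countryCode(), "Unknown");
        // Divide by 100 without floating-point rounding: 150000 pence becomes 1500.00.
        BigDecimal limit = src.creditLimitPence() == null
            ? BigDecimal.ZERO
            : BigDecimal.valueOf(src.creditLimitPence(), 2);
        return new CrmAccount(src.customerNumber(), first, src.lastName(),
                              name, country, limit, now);
    }
}
```

Passing the timestamp in as a parameter, rather than calling `Instant.now()` inside the mapper, keeps the mapping deterministic and easy to test.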

Part 2: Altova MapForce

Altova MapForce is a visual data mapping and transformation tool. It provides a drag-and-drop interface for mapping between XML, JSON, CSV, databases, and EDI formats, and can generate transformation code in Java, C#, XSLT, or XQuery.

MapForce Workflow

Define source schema — import or create an XML Schema, JSON Schema, or database structure
Define target schema — import or create the target structure
Draw mappings — drag connections from source fields to target fields
Add transformation functions — concatenation, date conversion, string operations, conditionals
Test with sample data — preview the transformed output
Generate transformation code — export as XSLT, Java, or C#

Installing MapForce

Download from altova.com. The free evaluation edition is fully functional for 30 days — sufficient for the course exercise.

Exercise: Map Order Data

Given:

Source: XML order from legacy system (field names in German: Bestellnummer, Kundennummer, Betrag)
Target: JSON order for REST API (fields: orderId, customerId, amount)

Steps:

Create a source XML schema (XSD) defining the German fields
Create a target JSON schema defining the English fields
In MapForce, connect source to target and add field mappings
Add a function to convert the amount from EUR cents (integer) to EUR (decimal)
Generate XSLT and verify with a sample XML file

Part 3: XSLT Transformation in Java

XSLT (Extensible Stylesheet Language Transformations) is a language for transforming XML documents. It is widely used in integration for:

Converting XML from one schema to another
Extracting specific elements from a large XML document
Generating different output formats (XML, HTML, text) from the same XML

XSLT Basics

An XSLT stylesheet defines template rules. When an input XML node matches a template, the template's content is written to the output.

Input XML (legacy order):

XML

<?xml version="1.0" encoding="UTF-8"?>
<Bestellung>
  <Bestellnummer>ORD-1234</Bestellnummer>
  <Kundennummer>CUST-99</Kundennummer>
  <Betrag>29999</Betrag>
  <Waehrung>EUR</Waehrung>
</Bestellung>

XSLT stylesheet (legacy-to-canonical.xsl):

XML

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/Bestellung">
    <order>
      <orderId>
        <xsl:value-of select="Bestellnummer"/>
      </orderId>
      <customerId>
        <xsl:value-of select="Kundennummer"/>
      </customerId>
      <amount>
        <!-- Convert cents to decimal: divide by 100 -->
        <xsl:value-of select="Betrag div 100"/>
      </amount>
      <currency>
        <xsl:value-of select="Waehrung"/>
      </currency>
    </order>
  </xsl:template>

</xsl:stylesheet>

Output XML (canonical order):

XML

<order>
  <orderId>ORD-1234</orderId>
  <customerId>CUST-99</customerId>
  <amount>299.99</amount>
  <currency>EUR</currency>
</order>

Applying XSLT in Java

JAVA

import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import java.io.*;

public class XsltTransformer {

    public static String transform(String xmlInput, String xsltPath) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();

        // Load the XSLT stylesheet
        Source xsltSource = new StreamSource(new File(xsltPath));
        Transformer transformer = factory.newTransformer(xsltSource);

        // Set up input and output
        Source xmlSource = new StreamSource(new StringReader(xmlInput));
        StringWriter output = new StringWriter();
        Result result = new StreamResult(output);

        // Apply the transformation
        transformer.transform(xmlSource, result);
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        String legacyXml = """
            <Bestellung>
              <Bestellnummer>ORD-1234</Bestellnummer>
              <Kundennummer>CUST-99</Kundennummer>
              <Betrag>29999</Betrag>
              <Waehrung>EUR</Waehrung>
            </Bestellung>
            """;

        String canonicalXml = transform(legacyXml, "legacy-to-canonical.xsl");
        System.out.println(canonicalXml);
    }
}
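
A `Transformer` instance must not be used by several threads at once, whereas a compiled `Templates` object is safe to share. An integration that transforms many messages can therefore compile the stylesheet once and create a `Transformer` per message. The following is a minimal sketch of that idea, assuming the stylesheet is available as a string. The class name `CachedXsltTransformer` and the cut-down demo stylesheet are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class CachedXsltTransformer {

    private final Templates templates; // compiled stylesheet, safe to share between threads

    CachedXsltTransformer(String xslt) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Restrict potentially unsafe features when stylesheets or input are untrusted.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        this.templates = factory.newTemplates(new StreamSource(new StringReader(xslt)));
    }

    String transform(String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer(); // one per call
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    // Cut-down version of the stylesheet above, used only for the demo.
    static final String DEMO_XSLT = """
        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:template match="/Bestellung">
            <order>
              <orderId><xsl:value-of select="Bestellnummer"/></orderId>
              <amount><xsl:value-of select="Betrag div 100"/></amount>
            </order>
          </xsl:template>
        </xsl:stylesheet>
        """;

    public static void main(String[] args) throws Exception {
        CachedXsltTransformer t = new CachedXsltTransformer(DEMO_XSLT);
        System.out.println(t.transform(
            "<Bestellung><Bestellnummer>ORD-1</Bestellnummer><Betrag>29999</Betrag></Bestellung>"));
    }
}
```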

Part 4: XML Schema Validation in Java

XML Schema (XSD) defines the valid structure of an XML document. Validating incoming messages against a schema before processing them prevents malformed data from propagating to downstream systems.

Example XSD

XML

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderId"    type="xs:string"         minOccurs="1"/>
        <xs:element name="customerId" type="xs:string"         minOccurs="1"/>
        <xs:element name="amount"     type="xs:decimal"        minOccurs="1"/>
        <xs:element name="currency"   type="currencyCodeType"  minOccurs="1"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:simpleType name="currencyCodeType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{3}"/>  <!-- ISO 4217: exactly 3 uppercase letters -->
    </xs:restriction>
  </xs:simpleType>

</xs:schema>

Validating in Java

JAVA

import javax.xml.validation.*;
import javax.xml.transform.stream.*;
import org.xml.sax.*;
import java.io.*;

public class XmlValidator {

    private final Schema schema;

    public XmlValidator(String xsdPath) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(
            javax.xml.XMLConstants.W3C_XML_SCHEMA_NS_URI
        );
        this.schema = factory.newSchema(new File(xsdPath));
    }

    public ValidationResult validate(String xmlContent) {
        try {
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xmlContent)));
            return ValidationResult.success();
        } catch (SAXException e) {
            return ValidationResult.failure(e.getMessage());
        } catch (IOException e) {
            return ValidationResult.failure("IO error: " + e.getMessage());
        }
    }

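    // Usage in integration flow
    public static void main(String[] args) throws Exception {
        XmlValidator validator = new XmlValidator("order.xsd");

        String validXml = "<order><orderId>ORD-1</orderId><customerId>C1</customerId>"
            + "<amount>99.99</amount><currency>EUR</currency></order>";

        // Two problems: customerId is missing and the currency is lowercase.
        // A negative amount is still a valid xs:decimal, so -5 is not an error here.
        String invalidXml = "<order><orderId>ORD-1</orderId>"
            + "<amount>-5</amount><currency>eur</currency></order>";

        System.out.println(validator.validate(validXml));    // success
        // Without a custom ErrorHandler the validator stops at the first error,
        // so this reports the missing customerId, not the currency pattern violation.
        System.out.println(validator.validate(invalidXml));  // failure
    }

    // Minimal result type so the example compiles on its own (records need Java 16+)
    public record ValidationResult(boolean valid, String message) {
        public static ValidationResult success() { return new ValidationResult(true, null); }
        public static ValidationResult failure(String message) { return new ValidationResult(false, message); }
    }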
}
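
The validator above reports only the first problem, because a `Validator` without an `ErrorHandler` throws on the first error. When the rejection message should list every violation, one option is to register an `ErrorHandler` that records errors instead of throwing. The following is a minimal sketch of that approach. The class name `CollectingXmlValidator` is hypothetical and the inline XSD is a cut-down variant of the example schema.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class CollectingXmlValidator {

    private final Schema schema;

    CollectingXmlValidator(String xsd) throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        this.schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
    }

    /** Returns all validation errors; an empty list means the document is valid. */
    List<String> validate(String xml) throws IOException {
        List<String> errors = new ArrayList<>();
        Validator validator = schema.newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Fatal errors (typically malformed XML) still abort validation.
            errors.add("Not well-formed: " + e.getMessage());
        }
        return errors;
    }

    static final String DEMO_XSD = """
        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
          <xs:element name="order">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="amount" type="xs:decimal"/>
                <xs:element name="currency">
                  <xs:simpleType>
                    <xs:restriction base="xs:string">
                      <xs:pattern value="[A-Z]{3}"/>
                    </xs:restriction>
                  </xs:simpleType>
                </xs:element>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:schema>
        """;
}
```

A fresh `Validator` is created per call because validators, unlike the `Schema` object, are not thread-safe.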

Part 5: Data Exchange Format Comparison

A summary reference for the course:

| Format | Type | Human-readable | Schema support | When to use |
|--------|------|----------------|----------------|-------------|
| JSON | Text | Yes | JSON Schema (optional) | REST APIs, modern integration |
| XML | Text | Yes | XSD (mature, widely used) | SOAP, legacy, healthcare, finance |
| CSV | Text | Yes | None built-in | Bulk file transfers, exports |
| Avro | Binary | No | Avro schema (required); usually managed in a Schema Registry with Kafka | High-throughput Kafka pipelines |
| Protobuf | Binary | No | .proto files | gRPC, performance-critical |
| EDI | Text (specialized) | Barely | Standards (X12, EDIFACT) | B2B trade, retail, logistics |
| Parquet | Binary (columnar) | No | Embedded in file metadata | Data warehouse/lake bulk loads |

Key decision factors:

Interoperability needed (external party) → JSON or XML (human-readable, universally supported)
Throughput > 100k messages/sec → Avro or Protobuf
Legacy system constraint → whatever the system supports (often XML or CSV)
B2B partner exchange → EDI (ANSI X12 or EDIFACT)
Analytical processing → Parquet or ORC
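
The decision factors can be written down as a small rule function, which makes the precedence between them explicit. The sketch below is illustrative only. The `Scenario` record and `FormatAdvisor` class are hypothetical, and the order of the checks (hard constraints first, throughput next, interoperability as the default) is an assumption rather than something the factors above prescribe.

```java
final class FormatAdvisor {

    enum Format { JSON_OR_XML, AVRO_OR_PROTOBUF, LEGACY_NATIVE, EDI, PARQUET_OR_ORC }

    // Hypothetical scenario description; the fields mirror the decision factors above.
    record Scenario(boolean legacyConstraint, boolean b2bPartner, boolean analytical,
                    long messagesPerSecond) {}

    // Assumed precedence: constraints that leave no choice are checked first.
    static Format recommend(Scenario s) {
        if (s.legacyConstraint()) return Format.LEGACY_NATIVE;   // whatever the system supports
        if (s.b2bPartner()) return Format.EDI;                   // ANSI X12 or EDIFACT
        if (s.analytical()) return Format.PARQUET_OR_ORC;        // columnar bulk processing
        if (s.messagesPerSecond() > 100_000) return Format.AVRO_OR_PROTOBUF;
        return Format.JSON_OR_XML;                               // interoperable default
    }
}
```

For the conceptual exercise, the justification matters more than the rule itself: state which factor dominated and why.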

Part 6: Basics of Integration Error Handling

The Three Classes of Integration Error

Class 1: Infrastructure errors
The infrastructure failed — network down, broker unavailable, target service unreachable.
Strategy: retry with exponential backoff. These errors are transient and usually self-resolve.

Class 2: Data errors
The message itself is the problem — missing required field, invalid format, business rule violation.
Strategy: do not retry. Send to Dead Letter Queue and alert the data owner. Retrying will not fix a bad message.

Class 3: Processing errors
The integration code failed — null pointer exception, out of memory, unexpected data condition.
Strategy: fix the code bug, then replay affected messages from DLQ.
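
The three classes can be expressed as a classification step that runs before any retry decision is made. The sketch below is one way to do that. The `ErrorClassifier` class and its nested `DataValidationException` are hypothetical stand-ins, and treating every `IOException` or `TimeoutException` as transient is a simplifying assumption.

```java
import java.io.IOException;
import java.util.concurrent.TimeoutException;

final class ErrorClassifier {

    enum ErrorClass { INFRASTRUCTURE, DATA, PROCESSING }
    enum Action { RETRY_WITH_BACKOFF, DEAD_LETTER_AND_ALERT, FIX_CODE_AND_REPLAY }

    // Hypothetical stand-in for a validation failure raised by the integration code.
    static final class DataValidationException extends RuntimeException {
        DataValidationException(String message) { super(message); }
    }

    static ErrorClass classify(Throwable t) {
        // Class 1: assumed transient (network, broker, unreachable target).
        if (t instanceof IOException || t instanceof TimeoutException) return ErrorClass.INFRASTRUCTURE;
        // Class 2: the message itself is the problem.
        if (t instanceof DataValidationException) return ErrorClass.DATA;
        // Class 3: everything else, e.g. NullPointerException or OutOfMemoryError.
        return ErrorClass.PROCESSING;
    }

    static Action actionFor(ErrorClass c) {
        return switch (c) {
            case INFRASTRUCTURE -> Action.RETRY_WITH_BACKOFF;
            case DATA -> Action.DEAD_LETTER_AND_ALERT;
            case PROCESSING -> Action.FIX_CODE_AND_REPLAY;
        };
    }
}
```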

Error Handling Code Pattern (Java)

JAVA

public void processMessage(Message message) {
    String correlationId = message.getStringProperty("correlationId");
    try {
        validateMessage(message);            // throws DataValidationException
        OrderDto order = transform(message); // throws TransformException
        sendToWarehouse(order);              // throws IOException (transient)
        message.acknowledge();
        log.info("Processed successfully, correlationId={}", correlationId);

    } catch (DataValidationException e) {
        // Class 2 (data error): do NOT retry, send to DLQ immediately
        log.error("Data validation failed, correlationId={}: {}", correlationId, e.getMessage());
        deadLetterQueue.send(message, "VALIDATION_FAILED", e.getMessage());

    } catch (TransformException e) {
        // Class 3 (processing error): park in DLQ for replay once the code is fixed
        log.error("Transformation failed, correlationId={}", correlationId, e);
        deadLetterQueue.send(message, "PROCESSING_FAILED", e.getMessage());

    } catch (IOException e) {
        // Class 1 (transient error): increment retry count, retry with backoff.
        // Assumes retryCount reads as 0 when the property has never been set.
        int retryCount = message.getIntProperty("retryCount") + 1;
        if (retryCount >= MAX_RETRIES) {
            log.error("Max retries exceeded, correlationId={}", correlationId);
            deadLetterQueue.send(message, "MAX_RETRIES_EXCEEDED", e.getMessage());
        } else {
            message.setIntProperty("retryCount", retryCount);
            retryQueue.sendWithDelay(message, backoffDelayMs(retryCount));
        }
    }
}

private long backoffDelayMs(int attempt) {
    return (long) Math.pow(2, attempt) * 1000L;  // 2s, 4s, 8s, 16s...
}
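
Pure exponential backoff grows without bound and makes every consumer retry at the same instants. Two common refinements, which go beyond the pattern above, are a maximum delay and random jitter. The sketch below shows both. The class name `Backoff` and the 60-second cap are assumptions chosen for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

final class Backoff {

    static final long BASE_MS = 1_000L;
    static final long MAX_MS = 60_000L; // hypothetical cap

    // Deterministic part: 2s, 4s, 8s, ... capped at MAX_MS.
    static long cappedDelayMs(int attempt) {
        int shift = Math.min(Math.max(attempt, 0), 20); // bound the exponent to avoid overflow
        return Math.min(BASE_MS << shift, MAX_MS);
    }

    // "Full jitter": a random delay in [0, cappedDelay], so that many consumers
    // retrying after the same outage do not hit the target in lockstep.
    static long jitteredDelayMs(int attempt) {
        return ThreadLocalRandom.current().nextLong(cappedDelayMs(attempt) + 1);
    }
}
```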

Error Response Design

When an integration rejects a message, the error response must be informative:

JSON

{
  "errorCode": "VALIDATION_FAILED",
  "errorMessage": "Required field 'customerId' is missing",
  "correlationId": "abc-123",
  "timestamp": "2026-04-18T10:30:00Z",
  "field": "customerId",
  "receivedValue": null
}

The error response goes to the DLQ alongside the original message. The operations team uses it to diagnose and fix the root cause.
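
On the Java side, the error response can be modelled as a small value type. The sketch below builds the JSON by hand only to stay dependency-free. The `ErrorResponse` record and its `toJson` method are hypothetical, and a real integration would more likely use a JSON library.

```java
import java.time.Instant;

record ErrorResponse(String errorCode, String errorMessage, String correlationId,
                     Instant timestamp, String field, String receivedValue) {

    String toJson() {
        return "{"
            + "\"errorCode\":" + quote(errorCode) + ","
            + "\"errorMessage\":" + quote(errorMessage) + ","
            + "\"correlationId\":" + quote(correlationId) + ","
            + "\"timestamp\":" + quote(timestamp == null ? null : timestamp.toString()) + ","
            + "\"field\":" + quote(field) + ","
            + "\"receivedValue\":" + quote(receivedValue)
            + "}";
    }

    // Minimal escaping for this sketch: backslash, double quote, and control characters.
    private static String quote(String s) {
        if (s == null) return "null";
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') sb.append('\\').append(c);
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.append('"').toString();
    }
}
```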

Part 7: Robotic Process Automation (RPA)

What Is RPA?

Robotic Process Automation (RPA) uses software robots to automate tasks that human workers perform through a user interface — clicking, typing, copying data from one screen to another.

RPA is used in integration when:

A system has no API and cannot be modified (legacy mainframes, old desktop applications)
The integration budget does not allow custom API development
The process is highly repetitive and rule-based

RPA robots interact with applications the same way a human would — through the UI:

Read data from one screen
Navigate to another screen or application
Enter data into form fields
Click buttons
Download and process files

How RPA Works

Architecture:

RPA Robot (software) → controls → User Interface of System A
                     → controls → User Interface of System B

The robot:

Identifies UI elements by their position, label, or accessibility properties
Interacts with them programmatically (click, type, select)
Can read screen content (OCR for scanned documents, UI element text)
Can handle exceptions (when a screen looks different from expected)

Major RPA Platforms

| Platform | Notes |
|----------|-------|
| UiPath | Market leader; enterprise-grade; large community |
| Automation Anywhere | Cloud-native RPA; good AI capabilities |
| Blue Prism | Strong in financial services and regulated industries |
| Power Automate Desktop | Microsoft's RPA tool; included with some Microsoft 365 licences |

RPA Capabilities

UI automation: interact with Windows applications, web browsers, terminal emulators, and SAP GUI.

Document processing: extract data from PDFs, scanned invoices, and emails using OCR and AI document understanding.

Attended automation: a robot that assists a human user — triggered by the user, performs steps, hands control back.

Unattended automation: a robot that runs on a schedule or is triggered by an event, with no human involvement.

Orchestrator: a central server that manages, schedules, and monitors robot runs.

When to Use RPA

RPA is appropriate when:

The target system has no API and cannot be modified
The process is stable and repetitive (the UI does not change frequently)
Volume justifies automation (e.g., 200+ manual entries per day)
A temporary solution is needed while a proper API integration is developed

RPA Limitations

Fragility: RPA robots break when the UI changes. A button renamed or moved can stop the robot. Maintenance cost is ongoing.

Performance: UI automation is much slower than API calls. An RPA robot might take 30 seconds to complete what an API call does in 100 milliseconds.

Error handling: robots that encounter unexpected UI states can get stuck or take wrong actions. Robust exception handling is critical.

Security: RPA robots need credentials to log into systems. These credentials must be securely managed (not hardcoded in scripts).

Not a substitute for APIs: RPA is a workaround, not a solution. Where an API is possible, build the API integration. Use RPA only when no API option exists or when building one cannot be justified.

RPA Integration Pattern

Trigger (schedule / event / manual)
  → RPA Robot activates
  → Log into System A (using stored credentials)
  → Extract required data (screen scraping / download)
  → Open System B
  → Input data into System B's form
  → Submit and verify
  → Log result (success / error)
  → Alert on failure
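
The pattern above can be sketched in Java to make the control flow and the failure path concrete. Everything here is hypothetical: `UiSession` is an invented abstraction rather than the API of any RPA product, the field and button labels are made up, and `InMemoryUiSession` is a fake that lets the flow run without a real user interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction over an RPA tool's UI actions.
interface UiSession {
    void login(String system, String credentialRef);
    String readField(String system, String fieldLabel);
    void typeInto(String system, String fieldLabel, String value);
    void click(String system, String buttonLabel);
}

// In-memory fake used only to exercise the flow without a real UI.
final class InMemoryUiSession implements UiSession {
    final Map<String, String> screens = new HashMap<>();
    @Override public void login(String system, String credentialRef) { }
    @Override public String readField(String system, String fieldLabel) {
        return screens.get(system + "/" + fieldLabel);
    }
    @Override public void typeInto(String system, String fieldLabel, String value) {
        screens.put(system + "/" + fieldLabel, value);
    }
    @Override public void click(String system, String buttonLabel) { }
}

final class OrderCopyRobot {
    private final UiSession ui;
    private final List<String> runLog = new ArrayList<>();

    OrderCopyRobot(UiSession ui) { this.ui = ui; }

    /** Copies one value from System A to System B; returns true on verified success. */
    boolean run() {
        try {
            ui.login("SystemA", "vault:system-a"); // credential reference, not a password
            String orderId = ui.readField("SystemA", "Order number");
            if (orderId == null || orderId.isBlank()) {
                throw new IllegalStateException("Order number not found on screen");
            }
            ui.login("SystemB", "vault:system-b");
            ui.typeInto("SystemB", "Order ID", orderId);
            ui.click("SystemB", "Submit");
            // Verify: read the value back rather than trusting the click.
            String saved = ui.readField("SystemB", "Order ID");
            if (!orderId.equals(saved)) {
                throw new IllegalStateException("Verification failed for " + orderId);
            }
            runLog.add("SUCCESS " + orderId);
            return true;
        } catch (RuntimeException e) {
            runLog.add("ERROR " + e.getMessage());
            runLog.add("ALERT sent to operations"); // placeholder for a real alert channel
            return false;
        }
    }

    List<String> runLog() { return List.copyOf(runLog); }
}
```

The read-back after submitting matters because a robot has no return value to inspect, so the only evidence of success is what the target screen shows afterwards.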

Lab Exercise Summary

This lab covers five technical areas assessed in the course exercises:

| Topic | Tool / Technology | Exercise |
|-------|-------------------|----------|
| Schema mapping | Altova MapForce | Map order XML from legacy format to canonical JSON |
| XSLT transformation | Java + XSLT | Transform German-field XML to English canonical XML |
| XML Schema validation | Java + XSD | Validate incoming order XML; reject invalid messages |
| Data format selection | Conceptual | Given a scenario, recommend the correct data format and protocol with justification |
| RPA | Conceptual | Describe how RPA would be used to integrate a legacy system with no API; identify risks |

Work through each exercise sequentially. The MapForce and Java exercises require the relevant tools installed (see course overview for details).

Course complete. You have covered all 11 topic areas in the System Integrations FITech course. Review the course overview for research article topic selection and submission guidelines.
