Automated Testing (IT)
Senior Automation QA Engineer / QA Architect

How would you construct a blueprint for an automation framework capable of executing end-to-end business process validation across heterogeneous technology stacks, specifically one that orchestrates interactions between legacy mainframe terminal emulators (3270), modern REST APIs, and dynamic React-based web portals, while maintaining a unified domain-specific language (DSL) for business analysts and providing rollback capabilities that preserve transactional integrity across these disparate systems?


History of the Question

Enterprises undergoing digital transformation often operate in complex "brownfield" environments where mission-critical core banking transactions continue to be processed by decades-old COBOL mainframes on IBM z/OS systems. Simultaneously, customer-facing onboarding and service flows are increasingly handled by modern React-based web portals and mobile applications. This technological divergence creates a significant validation challenge for QA teams, who must ensure seamless, error-free data flow and transactional consistency across these fundamentally disparate architectures.

Traditionally, automation efforts in such environments become heavily siloed, with specialized teams maintaining separate toolsets for mainframe terminal emulation (such as Jagacy or Extra!), generic UI automation (Selenium or Cypress), and API validation (Rest-Assured or Postman). This fragmentation leads to brittle integration suites written in highly technical jargon that non-technical business analysts cannot review or validate against requirements. Furthermore, catastrophic data integrity issues frequently emerge when a test fails mid-execution, potentially leaving a mainframe account created while the corresponding web portal verification remains incomplete, thereby polluting downstream environments with orphaned test data.

This specific question emerged from a Fortune 500 financial services firm struggling to validate a complex "new customer onboarding" workflow that spanned a mobile React application, a Kafka event bus, a Java microservice layer, and final account provisioning on an IBM z/OS mainframe. The organization required a unified automation strategy that could bridge these technical divides while maintaining the agility expected in modern DevOps pipelines. The challenge was further complicated by the need for business analysts to author and comprehend test scenarios without understanding the underlying technical implementations of each system.

The Problem

The core challenge lies in the fundamental impedance mismatch between synchronous web automation, which expects immediate DOM updates and event-driven interactions, and the block-mode terminal emulation of 3270 mainframes, which relies on explicit screen scraping and precise cursor positioning. RESTful APIs introduce further complexity by operating within a stateless request-response paradigm that lacks the session continuity inherent in terminal sessions. Bridging these architectural styles requires an abstraction layer capable of translating high-level business actions into system-specific commands without leaking technical implementation details into the test scenarios.

Maintaining a unified Domain-Specific Language (DSL) using tools like Gherkin becomes exceedingly difficult when the technical implementations of test steps diverge so wildly across systems. Web elements are typically identified using CSS selectors or XPath expressions, API validations rely on JSON path assertions and schema validation, while mainframe interactions depend on field coordinates, screen labels, or specific key sequences like F1 or Enter. Without a robust abstraction strategy, the DSL quickly becomes cluttered with technical locators and system-specific jargon, defeating its purpose as a communication medium between business and technical stakeholders.

Furthermore, guaranteeing true transactional integrity across these distributed systems requires implementing a Saga or Compensating Transaction pattern directly within the test framework architecture, which is non-trivial when the testing layer lacks native hooks into the mainframe's two-phase commit protocols or distributed transaction managers. When a test failure occurs in the web portal after a mainframe transaction has already been committed, the framework must possess the intelligence and capability to trigger explicit rollback procedures to restore environmental consistency. This necessitates sophisticated state tracking and error handling mechanisms that go far beyond standard try-catch blocks.

Finally, the automation framework must securely handle disparate authentication mechanisms without embedding sensitive credentials directly into test scripts. Web portals often utilize modern OAuth2 or SAML flows with Multi-Factor Authentication (MFA), REST APIs rely on API keys or JWT tokens, while legacy mainframes authenticate against RACF or ACF2 providers using static user profiles. A centralized, encrypted credential vault with environment-specific injection capabilities is essential to maintain security posture while enabling seamless cross-system authentication.

The Solution

To address these complexities, the framework should be architected using the Hexagonal Architecture (Ports and Adapters) pattern, which enforces strict separation between the test domain logic and external system interactions. Define an abstract ApplicationDriver port interface declaring high-level domain methods such as enterCustomerData(), verifyAccountCreation(), and rollbackTransaction(). This interface acts as the sole contract that your DSL layer (such as Cucumber Step Definitions or SpecFlow Bindings) is permitted to interact with, ensuring complete isolation from implementation specifics.
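As a sketch of how the DSL layer stays isolated behind the port, consider the following reduced two-method version of the BankingDriver port. The class and method names here are illustrative, and plain methods stand in for Cucumber annotations so the example runs without the Cucumber runtime:

```java
import java.util.ArrayList;
import java.util.List;

// Reduced port: the only contract the DSL layer is allowed to see.
interface BankingDriver {
    void enterCustomerData(String customerName);
    String submitAccountCreation(); // returns an account id
}

// In a real suite these methods would carry Cucumber @Given/@When annotations;
// plain methods are used so the sketch runs without the Cucumber runtime.
class OnboardingSteps {
    private final BankingDriver driver; // injected per environment

    OnboardingSteps(BankingDriver driver) { this.driver = driver; }

    // @When("the analyst enters customer data for {string}")
    public void theAnalystEntersCustomerData(String name) {
        driver.enterCustomerData(name);
    }

    // @When("the account creation is submitted")
    public String theAccountCreationIsSubmitted() {
        return driver.submitAccountCreation();
    }
}

// A recording stub stands in for a concrete Selenium/REST/HLLAPI adapter.
class RecordingDriver implements BankingDriver {
    final List<String> calls = new ArrayList<>();
    public void enterCustomerData(String name) { calls.add("enter:" + name); }
    public String submitAccountCreation() { calls.add("submit"); return "ACC-1"; }
}
```

Because the step definitions depend only on the interface, the same Gherkin scenario can execute against the web, API, or mainframe adapter without modification.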

Concrete adapter implementations handle the system-specific technicalities: a SeleniumWebAdapter translates port methods into browser interactions, a RestAssuredAdapter executes HTTP calls and parses JSON responses, and a HllapiMainframeAdapter utilizes the High-Level Language API to send keys, read screen buffers, and validate field contents on the 3270 emulator. Each adapter encapsulates its own retry logic, explicit wait mechanisms, and error handling strategies appropriate to its technology stack. When an adapter successfully completes an action that modifies state, it publishes a domain event (such as AccountCreatedEvent) to a central TestEventBus rather than returning primitive data types.

For transactional integrity, implement a Test Saga Orchestrator that maintains an ordered log of all executed CompensableAction objects during a test scenario. If any step in the workflow fails with an exception, the orchestrator automatically executes the compensate() method of previously successful actions in reverse order, effectively running a compensating transaction to delete the mainframe account or void the API reservation. This pattern ensures that the test environment remains pristine even when tests fail mid-flight, preventing the accumulation of orphaned data that plagues traditional end-to-end suites.

State management across the heterogeneous stack is achieved by treating the TestContext as a first-class citizen, utilizing ThreadLocal<DomainContext> to store rich domain objects rather than primitive strings, thereby preventing tight coupling between test steps. The React adapter might populate a CustomerProfile object in the context, which the mainframe adapter subsequently retrieves to execute its portion of the workflow. This approach ensures that the DSL remains focused on business entities rather than technical identifiers like session IDs or screen coordinates.

To tie these components together, utilize a lightweight messaging bus such as Google Guava EventBus or a reactive stream to allow adapters to communicate state changes without direct method invocation, thereby decoupling the mainframe flow from the web validation flow. When the HllapiMainframeAdapter successfully creates an account, it publishes an event containing the account details, which the SeleniumWebAdapter consumes to automatically navigate to the appropriate verification screen. This event-driven approach within the test framework mirrors modern microservices architecture and significantly reduces maintenance overhead when individual system interfaces change.
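A minimal hand-rolled bus illustrates the publish/subscribe idea without pulling in the Guava dependency; the class and event names are illustrative, not a prescribed API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Minimal stand-in for Guava's EventBus: adapters publish typed events,
// other adapters subscribe by event class, never by direct method call.
class TestEventBus {
    private final Map<Class<?>, List<Consumer<Object>>> subscribers = new HashMap<>();

    <T> void subscribe(Class<T> eventType, Consumer<T> handler) {
        subscribers.computeIfAbsent(eventType, k -> new ArrayList<>())
                   .add(e -> handler.accept(eventType.cast(e)));
    }

    void post(Object event) {
        subscribers.getOrDefault(event.getClass(), List.of())
                   .forEach(h -> h.accept(event));
    }
}

// Event the mainframe adapter publishes once the account is committed.
record AccountCreatedEvent(String accountId) {}

// The web adapter reacts to mainframe events instead of being called directly.
class PortalVerifier {
    String lastVerified;

    PortalVerifier(TestEventBus bus) {
        bus.subscribe(AccountCreatedEvent.class, e -> lastVerified = e.accountId());
    }
}
```

Swapping in Guava's EventBus later changes only the wiring, not the adapters, because subscribers are keyed by event type rather than by producer.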

```java
// Port Interface Definition -- the sole contract the DSL layer sees.
public interface BankingDriver {
    void enterCustomerData(Customer customer);
    AccountDetails submitAccountCreation();
    void verifyAccountInPortal(AccountDetails account);
    void rollbackAccountCreation(AccountDetails account);
}

// Mainframe Adapter using HLLAPI.
public class MainframeAdapter implements BankingDriver {

    private final HllapiWrapper hllapi;
    private final EventBus eventBus;

    public MainframeAdapter(HllapiWrapper hllapi, EventBus eventBus) {
        this.hllapi = hllapi;
        this.eventBus = eventBus;
    }

    @Override
    public void enterCustomerData(Customer customer) {
        // Screen navigation and field population elided for brevity.
    }

    @Override
    public AccountDetails submitAccountCreation() {
        hllapi.sendKey("@E"); // "@E" is the HLLAPI mnemonic for the Enter key
        waitForScreen("Account Created");
        String accountId = hllapi.getTextByLabel("Account Number:");
        AccountDetails details = new AccountDetails(accountId);
        eventBus.post(new AccountCreatedEvent(details));
        return details;
    }

    @Override
    public void verifyAccountInPortal(AccountDetails account) {
        throw new UnsupportedOperationException("Handled by the web adapter");
    }

    @Override
    public void rollbackAccountCreation(AccountDetails account) {
        hllapi.sendKeys("DELETE " + account.getId()); // type the command text
        hllapi.sendKey("@E");
        verifyScreen("Deletion Confirmed");
    }
}

// Pairs an executed action with its undo logic.
record CompensableAction(Runnable action, Runnable compensation) {
    void compensate() { compensation.run(); }
}

// Saga Orchestrator for Transactional Integrity.
public class TestSagaOrchestrator {

    private final List<CompensableAction> executedActions = new ArrayList<>();

    public void execute(Runnable action, Runnable compensation) {
        try {
            action.run();
            executedActions.add(new CompensableAction(action, compensation));
        } catch (Exception e) {
            compensate();
            throw new TestFailureException(e);
        }
    }

    private void compensate() {
        // Undo successful actions in reverse order without mutating the log.
        List<CompensableAction> reversed = new ArrayList<>(executedActions);
        Collections.reverse(reversed);
        for (CompensableAction action : reversed) {
            try {
                action.compensate();
            } catch (Exception ex) {
                publishToDeadLetterQueue(action, ex); // never mask the original failure
            }
        }
    }
}
```

Situation from Life

During a 2022 consulting engagement with a global insurance provider undergoing digital transformation, I encountered a critical "First Notice of Loss" (FNOL) business process that exemplified these exact challenges. The workflow required a policyholder to submit a claim via a React Native mobile application by uploading accident photos, which triggered a Python-based machine learning microservice for damage assessment and fraud detection, before finally updating a legacy Unisys mainframe system to allocate financial reserves and validate policy coverage. The existing automation strategy relied on three distinct, non-communicating suites: Cypress for the mobile app, Pytest for the API, and Jagacy for the mainframe terminal emulation.

The siloed approach necessitated manual correlation of claim numbers between teams using shared Excel spreadsheets, and environmental pollution became a severe blocker during regression cycles. The crisis moment occurred when a mobile network timeout caused a test to fail after the mainframe had already committed a $50,000 reserve allocation, leaving the financial data in an inconsistent state that required four hours of manual cleanup by a mainframe systems programmer. This incident directly violated the team's "clean environment" policy and blocked the CI/CD pipeline for an entire business day.

We evaluated three potential remediation strategies to prevent future occurrences. The first option involved writing post-test database cleanup scripts to manually reverse mainframe transactions, but this was rejected because security policies prohibited direct SQL access to the production-like UAT mainframe environment. The second approach proposed implementing a shared test data pool with pessimistic locking mechanisms to serialize test execution, but this would have increased the suite execution time from twenty minutes to over four hours, completely negating the benefits of parallelization in CI/CD. The third strategy, which we ultimately selected, involved implementing a Saga pattern within the test automation framework itself, mirroring the application's own eventual consistency model while preserving the ability to run hundreds of tests in parallel.

The implemented solution introduced a ClaimSaga orchestrator that intercepted every action performed by the mobile and mainframe adapters. When the mobile adapter threw a StaleElementReferenceException due to the network timeout, the saga immediately triggered the reverseReserveAllocation() compensating transaction on the mainframe adapter using the claim ID stored in the ThreadLocal context. This automatic rollback mechanism reduced environmental data pollution by ninety-eight percent and allowed the team to confidently run five hundred parallel threads in their Jenkins pipeline without fear of creating orphaned financial records.

This dramatic improvement in test reliability allowed the QA team to shift their focus from manual data cleanup to exploratory testing and edge case analysis. Business analysts could finally author and review test scenarios written in plain English, such as "Given a policyholder reports a major accident, When photos are uploaded but the AI assessment service times out, Then no financial reserve shall be allocated." This ensured that the automation suite served as accurate living documentation reflecting complex business rules across all three technological tiers.

What Candidates Often Miss


How do you handle session state persistence across the emulator and the web portal without creating tight coupling between the adapters?

Novice candidates frequently attempt to solve this by returning raw session identifiers or database primary keys directly from step definition methods, creating brittle dependencies where Step B cannot execute until Step A has explicitly returned a specific string value. This approach fundamentally breaks Domain-Driven Design principles and forces business-readable Gherkin steps to be ordered in a strictly technical sequence rather than a logical business flow. Furthermore, it leaks implementation details into the DSL layer, making tests fragile when technical identifiers change format.

The robust architectural solution implements a Scenario Context or Test Data Context that acts as a transient registry for the duration of the test execution, typically implemented using ThreadLocal<Map<Class<?>, Object>> to ensure thread safety during parallel execution. Adapters do not return primitive values to the DSL layer; instead, they publish strongly-typed domain events or objects into this context. For example, when the mainframe adapter successfully creates an account, it publishes an AccountCreatedEvent containing the full account entity, which the web adapter subsequently retrieves by listening to the event bus or querying the context.
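The type-keyed registry described above can be sketched as follows; the class names are illustrative, and in a real framework `clear()` would be invoked from a scenario teardown hook:

```java
import java.util.HashMap;
import java.util.Map;

// One map per test thread, keyed by domain class, so parallel scenarios
// never observe each other's state.
final class ScenarioContext {
    private static final ThreadLocal<Map<Class<?>, Object>> STORE =
            ThreadLocal.withInitial(HashMap::new);

    static <T> void put(T value) {
        STORE.get().put(value.getClass(), value);
    }

    static <T> T get(Class<T> type) {
        return type.cast(STORE.get().get(type));
    }

    // Call from an @After hook to avoid leakage across pooled worker threads.
    static void clear() {
        STORE.remove();
    }
}

// Illustrative domain object the mainframe adapter might publish.
record AccountDetails(String accountId) {}
```

Because the registry is keyed by class rather than by string, callers get compile-time typing and cannot collide on ad hoc key names.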

This event-driven approach ensures that the DSL layer remains completely agnostic regarding the origin of data, whether the policy number was scraped from a green screen or returned in a JSON response. By depending on abstractions rather than concrete implementations, the framework adheres to the Dependency Inversion Principle. This allows individual adapters to be refactored or replaced without impacting the business-readable test scenarios, significantly reducing long-term maintenance costs.


What specific mechanism prevents a compensating transaction from itself failing, potentially leaving the system in an inconsistent state?

Many junior engineers overlook the critical failure mode where the compensation logic itself encounters an error. Such errors can include network timeouts while attempting to delete a mainframe record or validation failures because the record was already modified by a concurrent background process. This scenario results in "toxic data" accumulation where the original action succeeded but the rollback failed, leaving the test environment in a permanently corrupted state.

The solution requires implementing idempotent compensating actions that are designed to be safely retried multiple times without causing duplicate deletion errors. These should be coupled with a robust retry mechanism featuring exponential backoff and circuit breaker patterns to handle transient infrastructure failures gracefully. If all retry attempts are exhausted, the framework must publish the failed compensation details to a persistent Dead-Letter Queue (DLQ). This DLQ can be implemented as a database table or message topic containing full correlation IDs and stack traces.
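One way to sketch the retry-with-backoff and dead-letter behavior is shown below; all names are illustrative, and the in-memory list stands in for the persistent DLQ table or topic:

```java
import java.util.ArrayList;
import java.util.List;

class CompensationRunner {
    // A real DLQ entry would also carry correlation IDs and the stack trace.
    record DeadLetter(String actionName, Exception cause) {}

    final List<DeadLetter> deadLetterQueue = new ArrayList<>();

    // Retries an idempotent compensation with exponential backoff; after
    // maxAttempts, parks the failure in the DLQ instead of aborting the run.
    boolean compensateWithRetry(String actionName, Runnable compensation,
                                int maxAttempts, long baseDelayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                compensation.run(); // must be idempotent: safe to re-run
                return true;
            } catch (Exception e) {
                if (attempt == maxAttempts) {
                    deadLetterQueue.add(new DeadLetter(actionName, e));
                    return false;
                }
                try {
                    // Exponential backoff: base, 2x base, 4x base, ...
                    Thread.sleep(baseDelayMillis << (attempt - 1));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    deadLetterQueue.add(new DeadLetter(actionName, ie));
                    return false;
                }
            }
        }
        return false;
    }
}
```

A production version would add a circuit breaker around the retry loop so a hard-down mainframe does not stall hundreds of parallel compensations.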

Additionally, implement validation gates before attempting compensation to verify the current state of the downstream system. For instance, confirm that the mainframe account exists, has a zero balance, and lacks recent user modifications before issuing a deletion command. A nightly automated reconciliation job can then process the DLQ to handle these orphaned records manually, ensuring that the test environment self-heals and preventing critical regressions from being masked by existing data pollution.


Why is using coordinate-based screen scraping (HLLAPI) for mainframes considered a liability, and how do you abstract it to reduce maintenance overhead when screen layouts inevitably change?

Candidates frequently advocate for hardcoded row and column coordinates, such as getText(10, 45, 10) to read ten characters starting at row ten, column forty-five. They favor this approach because it appears precise and deterministic during initial test development. However, this strategy creates a severe maintenance burden because mainframe applications frequently undergo screen modifications where new fields are inserted, causing all subsequent coordinate offsets to shift and rendering entire test suites invalid without warning.

The robust architectural solution implements a Screen Object Model that maps logical field names (such as ACCOUNT_NUMBER_FIELD) to dynamic search criteria rather than static coordinates. It utilizes the mainframe emulator's Field Identification capabilities, available via HLLAPI functions like FindFieldPosition or SearchField, to locate fields by their associated labels (for example, searching for the text "Account Number:"). At runtime, the adapter searches the screen buffer for the label text and calculates the relative offset to the corresponding input field. When the screen layout changes, only the JSON configuration file mapping labels to offsets requires updating, leaving the compiled Java code untouched.
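A simplified sketch of label-relative lookup over a screen buffer follows; the buffer representation and method names are assumptions for illustration, not the actual HLLAPI surface:

```java
// Instead of hardcoded coordinates, scan the 24x80 buffer for the label text
// and read the field that follows it on the same row.
class ScreenObjectModel {
    private final String[] rows; // one fixed-width string per screen row

    ScreenObjectModel(String[] rows) { this.rows = rows; }

    // Returns the fieldLength characters following the label, trimmed,
    // or null if the label is not on screen (caller should fail fast).
    String readFieldByLabel(String label, int fieldLength) {
        for (String row : rows) {
            int pos = row.indexOf(label);
            if (pos >= 0) {
                int start = pos + label.length();
                int end = Math.min(start + fieldLength, row.length());
                return row.substring(start, end).trim();
            }
        }
        return null;
    }
}
```

When a new field is inserted above "Account Number:", the scan still finds the label and the relative offset remains valid, which is exactly the resilience hardcoded coordinates lack.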

For even greater resilience, implement a Screen Hash or Checksum mechanism that captures a cryptographic hash of the unprotected field contents at the start of the interaction. If the hash does not match the expected baseline, the framework fails fast with a clear "Screen Mismatch" error rather than attempting to read data from incorrect positions. This prevents tests from proceeding with garbage data that would generate false negatives or false positives, and immediately alerts the automation team to screen changes requiring configuration updates.
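The fail-fast checksum gate might look like this minimal sketch; which regions of the buffer to hash is a per-screen decision, and the class name is illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

class ScreenGuard {
    // SHA-256 over the captured region of the screen buffer.
    static String checksum(String screenRegion) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha.digest(screenRegion.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Fail fast instead of scraping garbage from shifted coordinates.
    static void assertScreenMatches(String screenRegion, String expectedChecksum) {
        String actual = checksum(screenRegion);
        if (!actual.equals(expectedChecksum)) {
            throw new AssertionError("Screen Mismatch: expected " + expectedChecksum
                    + " but was " + actual);
        }
    }
}
```

The baseline checksum is recorded when the screen map is last verified, so any layout change surfaces as a single explicit "Screen Mismatch" error at the first interaction.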