By Vitalii Lustin

Creating custom game development tools: Improving slot game workflow efficiency

2026-04-22
Reading time 5:48 min

For most people, the idea of improving game workflow sounds abstract. For me, it began as a very concrete frustration. I was watching skilled QA engineers spend entire days replaying the same slot scenarios, while developers waited in uncertainty, and releases slowed for reasons that had nothing to do with game quality or technical competence. At Yggdrasil Gaming and later at SpinPlay Games, I saw manual regression become the single most damaging element in an otherwise capable pipeline. Every new game we released made that process more fragile and slower.

The main problem was not the testing process but rather the way testing was organized. Our tools could not see the real situation in the game engine. They observed pixels, not logic. They guessed rather than verified. I understood that if we wanted to scale production without sacrificing reliability, we needed a system that could operate inside the game, not around it. That realization shaped everything that followed.

What our team built was a controlled automation framework that interacted directly with the PixiJS engine, exposed real-time game state, and allowed us to simulate full player flows with precision. This was not about automation for speed alone. It was about creating an environment where every developer and tester could objectively understand game behavior, validate it, and trust the result without manual interpretation.

Understanding the real bottleneck in slot pipelines

Manual regression was destroying cycle time. Every new commit demanded full smoke and regression testing across predictable flows: base spins, feature triggers, bonus entry, free spins sequences, jackpot verification, and win confirmation. These were not edge cases; they were routine checks repeated endlessly. As the portfolio grew into dozens of active slots, this routine no longer scaled. QA became trapped in repetition, and developers were stuck waiting for validation that often took one or two full working days.

This delay had strategic consequences. Releases slipped, confidence weakened, and teams began compensating with rushed fixes close to deployment windows. I realized that unless this part of the pipeline was rebuilt, no amount of visual polish or engine optimization would protect production stability.

Core architecture of the test runner

The test runner was built as a controlled execution system, not a loose collection of scripts. Its foundation was a TypeScript core library that functioned as the single source of truth for test execution. This core managed environment initialization, browser orchestration, PixiJS access, method resolution, logging logic, reporting flow, and failure handling.

We separated this system into a strict core layer and a project layer. The core contained BaseTest, CoreMethods, the Pixi bridge, Puppeteer setup, and integration logic for Mocha, Allure, and Jira. The project layer contained only what was specific to each game: scenario definitions and any custom behavior required by particular slot mechanics. This ensured that structural complexity never leaked into scenario design and that the core remained reusable across teams and titles.

Each test execution followed a deterministic lifecycle. The environment booted, the game loaded, Pixi readiness was verified, frame state stabilized, scenario logic executed, object state validated, and finally reports were generated with complete contextual information. This removed timing uncertainty and reduced random failures that often plague traditional front-end automation.
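
To make the ordering concrete, the lifecycle can be sketched as below. The step names are hypothetical placeholders for the core's internal methods, not the actual API:

```typescript
// Minimal sketch of the deterministic lifecycle; method names are placeholders.
interface LifecycleSteps {
  bootEnvironment(): Promise<void>;    // launch Puppeteer and navigate to the game URL
  waitForPixiReady(): Promise<void>;   // poll until the Pixi application is exposed
  waitForStableFrame(): Promise<void>; // let the rendering loop settle before acting
  collectReport(): Promise<void>;      // write results and context for reporting
}

async function runScenario(
  steps: LifecycleSteps,
  scenario: () => Promise<void>,
): Promise<void> {
  await steps.bootEnvironment();
  await steps.waitForPixiReady();
  await steps.waitForStableFrame();
  try {
    await scenario();                  // scenario logic runs only against a stabilized frame
  } finally {
    await steps.collectReport();       // reports are generated even when the scenario fails
  }
}
```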

BaseTest as the execution authority

BaseTest was designed as a single entry point that controlled every aspect of test execution. It handled environment setup, Puppeteer session creation, navigation, game loading, Pixi application exposure, frame stabilization, and teardown. Before any scenario ran, BaseTest ensured that the game had fully loaded, that all essential containers were present, and that the rendering loop was stable.

For QA engineers, the experience was intentionally simple. They extended BaseTest and expressed logic through high-level method calls such as spin(), changeBet(), openPopup() or getText(). The complexity of timing control, error recovery and environment stabilization remained hidden and centralized. This allowed QA to focus on scenario logic rather than on technical mechanics.
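
A scenario written against that surface looked roughly like the sketch below; the import path, method signatures, and the 'winAmountLabel' object name are illustrative placeholders rather than actual project code:

```typescript
import { expect } from 'chai';
import { BaseTest } from '../core/BaseTest'; // placeholder path into the core layer

// Hypothetical base-spin scenario expressed through the high-level methods.
class BaseSpinScenario extends BaseTest {
  async run(): Promise<void> {
    await this.changeBet(1.0);                        // set a known bet level
    await this.spin();                                // trigger a spin and wait for the reels to stop
    const win = await this.getText('winAmountLabel'); // read the win figure from engine state
    expect(Number(win)).to.be.at.least(0);            // assert on real values, not on pixels
  }
}
```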

This design decision significantly changed test authoring. Instead of writing fragile procedural scripts, QA engineers defined structured behavioral flows that remained stable even as the UI evolved.

PixiJS internals and object resolution

The most critical technical layer was the Pixi bridge. Instead of interacting with the rendered canvas, the system accessed window.PIXI_APP directly, traversing the entire display tree recursively. This allowed the tool to identify any object by logical name, hierarchy placement, or metadata signature.

Through this, we could resolve objects such as balance fields, win amount labels, spin buttons, and feature popups with absolute precision. Assertions were based on actual object properties like .text, .visible, .alpha, and .interactive. This replaced unstable screenshot checks with direct state verification.

For example, rather than visually confirming a win figure, we extracted the value directly from the PIXI.Text object that represented that label. This ensured accuracy irrespective of resolution, scaling, or layout changes. Similarly, button interaction was tied to object identity, not screen position, allowing tests to remain valid even when UI layouts shifted.
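
In browser context (for example inside a page.evaluate call), that resolution amounts to a recursive walk of the display tree. A minimal sketch, assuming the game exposes its application as window.PIXI_APP as described above; the label name and the property snapshot are illustrative:

```typescript
// Browser-context half of the Pixi bridge: resolve a named object and read its
// real properties instead of inspecting rendered pixels.
function findByName(node: any, name: string): any | null {
  if (node.name === name) return node;
  for (const child of node.children ?? []) {
    const hit = findByName(child, name);
    if (hit) return hit;
  }
  return null;
}

const app = (window as any).PIXI_APP;                     // exposed by the game build
const winLabel = findByName(app.stage, 'winAmountLabel'); // illustrative object name

// Assertions are driven by actual object state, valid at any resolution or layout.
const snapshot = winLabel && {
  text: winLabel.text,              // the literal win figure rendered by the PIXI.Text
  visible: winLabel.visible,
  alpha: winLabel.alpha,
  interactive: winLabel.interactive,
};
```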

Precise method flows and execution logic

Every interaction was structured around logic integrity. When triggering a spin, the system verified the presence of the spin button, injected the event directly into the Pixi object, monitored state changes for spin start confirmation, tracked reel progression, awaited reel stop signals, and only then unlocked the frame for assertions.
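
A condensed sketch of that flow, written against Puppeteer. The 'spinButton' name, the 'pointertap' event, and the GAME_STATE flag are assumptions standing in for the real game-specific hooks:

```typescript
import { Page } from 'puppeteer';

// Hypothetical spin trigger following the flow described above.
async function spin(page: Page): Promise<void> {
  // 1. Verify the spin button exists and is interactive, then inject the event
  //    directly into the Pixi object rather than clicking screen coordinates.
  const triggered = await page.evaluate(() => {
    const find = (node: any): any => {
      if (node.name === 'spinButton') return node;          // illustrative object name
      for (const child of node.children ?? []) {
        const hit = find(child);
        if (hit) return hit;
      }
      return null;
    };
    const button = find((window as any).PIXI_APP.stage);
    if (!button || !button.interactive) return false;
    button.emit('pointertap');                              // Pixi display objects are event emitters
    return true;
  });
  if (!triggered) throw new Error('Spin button not found or not interactive');

  // 2. Wait for the spin to start, then for the reels to report stopped, before
  //    unlocking the frame for assertions. GAME_STATE is a stand-in for the real hook.
  await page.waitForFunction(() => (window as any).GAME_STATE?.reelsSpinning === true);
  await page.waitForFunction(() => (window as any).GAME_STATE?.reelsSpinning === false, {
    timeout: 30_000,
  });
}
```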

Retrieving text followed a defined process: resolve the object through the Pixi tree, confirm its class type, extract its internal value, and forward it with full confidence to the assertion layer. This eliminated approximation entirely and ensured that every validation step reflected the true internal game state.
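
On the Node side, the same flow can be expressed as a single core method. The constructor-name check is an assumption about how a PIXI.Text node might be recognized without importing PIXI into the runner; a minified build would need a different marker:

```typescript
import { Page } from 'puppeteer';

// Sketch of the text-retrieval flow: resolve the object, confirm its class,
// extract the value, and hand it unchanged to the assertion layer.
async function getText(page: Page, objectName: string): Promise<string> {
  const result = await page.evaluate((name: string) => {
    const find = (node: any): any => {
      if (node.name === name) return node;
      for (const child of node.children ?? []) {
        const hit = find(child);
        if (hit) return hit;
      }
      return null;
    };
    const obj = find((window as any).PIXI_APP.stage);
    if (!obj) return { error: `"${name}" was not found in the display tree` };
    if (obj.constructor?.name !== 'Text') return { error: `"${name}" is not a PIXI.Text` };
    return { value: obj.text as string };
  }, objectName);

  if ('error' in result) throw new Error(result.error);
  return result.value;                 // forwarded untouched to Chai/Mocha assertions
}
```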

The same logic applied to detecting popups, verifying feature activation, and confirming object destruction. The system could confirm whether a popup had merely become invisible or had been fully removed from the render tree, allowing us to assert correct lifecycle behavior rather than rely on subjective observation.
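
That distinction can be sketched as follows, with 'featurePopup' standing in for whatever the real popup container is named:

```typescript
import { Page } from 'puppeteer';

// Distinguish a popup that is merely hidden from one removed from the render tree.
async function getPopupState(
  page: Page,
  popupName: string,
): Promise<'visible' | 'hidden' | 'removed'> {
  return page.evaluate((name: string): 'visible' | 'hidden' | 'removed' => {
    const find = (node: any): any => {
      if (node.name === name) return node;
      for (const child of node.children ?? []) {
        const hit = find(child);
        if (hit) return hit;
      }
      return null;
    };
    const popup = find((window as any).PIXI_APP.stage);
    if (!popup) return 'removed';                       // no longer part of the display tree
    return popup.visible && popup.alpha > 0 ? 'visible' : 'hidden';
  }, popupName);
}

// Example: expect(await getPopupState(page, 'featurePopup')).to.equal('removed');
```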

Draw call tracking and render behavior analysis

To ensure visual correctness and performance stability, the system also monitored render behavior in real time. By observing draw call patterns and frame state metrics, we could detect irregular loops, performance degradation or visual object persistence issues. This became especially valuable during complex bonus rounds and heavy visual sequences where performance risks are higher.
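
One way to observe this from the runner is to wrap the WebGL context, assuming the game runs on Pixi's WebGL renderer and exposes it through the application object; the __drawCalls counter below is purely illustrative:

```typescript
import { Page } from 'puppeteer';

// Install a draw call counter by wrapping gl.drawElements on the game's WebGL context.
async function startDrawCallCounter(page: Page): Promise<void> {
  await page.evaluate(() => {
    const gl = (window as any).PIXI_APP.renderer.gl as WebGLRenderingContext;
    (window as any).__drawCalls = 0;
    const originalDrawElements = gl.drawElements.bind(gl);
    gl.drawElements = (mode, count, type, offset) => {
      (window as any).__drawCalls += 1;                 // one increment per batched draw
      return originalDrawElements(mode, count, type, offset);
    };
  });
}

// Read and reset the counter, for example once per scenario step or bonus sequence.
async function readDrawCalls(page: Page): Promise<number> {
  return page.evaluate(() => {
    const calls = (window as any).__drawCalls as number;
    (window as any).__drawCalls = 0;
    return calls;
  });
}
```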

This allowed us to identify performance regressions early, long before they became player-facing problems. It was not just about correctness but about maintaining high runtime quality under real load conditions.

CI mechanics and runtime orchestration

The integration with GitLab CI transformed the system from a local tool into a production decision engine. On every commit, the pipeline launched parallel Puppeteer instances, each assigned to specific test groups, ensuring fast execution without compromising coverage depth.
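
As one illustration of the orchestration, test groups can be split across parallel jobs using the CI_NODE_INDEX and CI_NODE_TOTAL variables GitLab sets for jobs declared with parallel:; the spec file names are placeholders:

```typescript
// Split the spec files across parallel GitLab CI jobs; the file list is a placeholder.
const specs: string[] = [
  'base-spins.spec.ts',
  'feature-triggers.spec.ts',
  'free-spins.spec.ts',
  'jackpot.spec.ts',
];

const total = Number(process.env.CI_NODE_TOTAL ?? 1);
const index = Number(process.env.CI_NODE_INDEX ?? 1); // 1-based in GitLab CI

// Each parallel job takes every N-th group, so coverage is split without overlap.
const shard = specs.filter((_, i) => i % total === index - 1);
console.log(`Node ${index}/${total} runs:`, shard);
```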

All test runs generated structured Allure reports that documented state behavior, failure points, and execution context. When a failure occurred, the Jira integration automatically created or updated task records with detailed information, eliminating manual reporting delays and uncertainty. This automated chain ensured that defects were documented, traceable, and reproducible without ambiguity.
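
The reporting chain itself needs nothing exotic: a failed run can post to Jira's issue-creation endpoint. A minimal sketch, with the project key, credentials, and field contents as placeholders for the real configuration:

```typescript
// Create a Jira issue for a failed scenario via the REST API (v2 create-issue endpoint).
// JIRA_URL and JIRA_TOKEN are placeholder environment variables.
async function reportFailure(summary: string, details: string): Promise<void> {
  const response = await fetch(`${process.env.JIRA_URL}/rest/api/2/issue`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Basic ${process.env.JIRA_TOKEN}`, // base64-encoded "email:api-token"
    },
    body: JSON.stringify({
      fields: {
        project: { key: 'SLOT' },            // placeholder project key
        issuetype: { name: 'Bug' },
        summary,
        description: details,                // execution context pulled from the run report
      },
    }),
  });
  if (!response.ok) throw new Error(`Jira reporting failed: ${response.status}`);
}
```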

This created a stable operational rhythm where code changes immediately entered a verification cycle and surfaced actionable feedback in minutes rather than days.

Cultural and leadership transformation

From a leadership standpoint, the impact extended far beyond automation. Developers assumed responsibility for testability, implementing Pixi hooks and verifying logic before committing changes. QA engineers moved away from manual execution and became scenario designers, shaping test logic through a shared system vocabulary. Communication became structured, test-driven, and data-based rather than assumption-driven.

This replaced the reactive loop of development and testing with a proactive system of shared accountability. The workflow no longer followed a pattern of Dev → QA → Dev → recheck. Instead, quality became embedded within development itself, changing the production mindset across teams I worked with.

Closing perspective

Before these tools existed, release decisions relied heavily on visual judgment and instinct. Even experienced teams were forced to infer internal behaviour from external signals. That is not a people problem; it is a systems problem.

By exposing real engine state and validating logic directly, we replaced assumptions with evidence. Quality stopped being something teams felt and became something the system could prove. Release conversations shifted from confidence-based discussions to fact-based decisions grounded in observable behaviour.

For studios facing similar scaling pressures, a few practical lessons are clear:

1. Automate game logic, not visuals.

Pixel-level validation will always be fragile in state-driven games.

2. Centralise execution control.

Deterministic test lifecycles remove timing uncertainty and random failures.

3. Design for testability at the engine level.

Hooks and state access should be part of feature development, not added later.

4. Treat automation as a leadership tool, not a QA aid.

The right systems change behaviour, ownership, and accountability across teams.

The goal was never to build a complex automation framework. It was to restore trust in the production process between developers and QA, and between engineering and release management. When teams can see what the game is actually doing internally, growth no longer comes at the cost of control. It becomes deliberate, repeatable, and sustainable.
