5.5 Workflow and rules engines
Workflow engine
Lending has many long-running, branching, recoverable workflows:
- Application lifecycle.
- Sanction → documents → disbursement saga.
- Collection cycle per case.
- Restructuring approval workflow.
- Settlement cycles.
- Periodic KYC refresh.
- DLG invocation pipeline.
- Partner / vendor onboarding due diligence.
Building these as ad-hoc state machines in service code becomes unmaintainable past 5 – 10 flows. Use a dedicated workflow engine.
Options
| Engine | Strengths | Weaknesses |
|---|---|---|
| Temporal (temporal.io) | Code-first; durable execution; great DX; rich SDKs (Java, Go, TypeScript); strong testing story | Operational complexity for self-host; SaaS cost |
| Camunda 8 (Zeebe) | BPMN visual modelling; good for business-analyst collaboration; mature | Workflow-as-XML feels heavyweight to engineers; learning curve |
| Camunda 7 | BPMN, embeddable inside JVM monolith; widely used in fintech; mature | Older paradigm; some EE features in commercial edition only |
| Flowable | BPMN engine; embeddable; Apache 2; lighter alternative to Camunda 7 | Smaller ecosystem |
| Apache Airflow | Scheduled DAGs; data engineering | Wrong fit for transactional workflows |
| Custom (state machine in code) | Simple at start | Becomes a maintenance burden past 5 – 10 flows |
Recommendation
For a Java / Spring shop with mixed transactional and long-running workflows:
- MVP: Camunda 7 embeddable inside the modular monolith. Low ops; BPMN is acceptable; familiar to many Indian fintech teams.
- Alternative MVP: Temporal if the team is comfortable with code-first workflows and ops can support it.
- Production at scale: Temporal for new workflows if code-first culture takes hold; stay with Camunda for legacy flows.
What goes into the workflow engine
- Sanction-to-disbursement saga: multi-step, multi-module, compensating actions on failure.
- Application lifecycle: state machine with KYC waits, ingestion completion gates, decision waits.
- Co-lending booking handoff: with partner-side retries and SLA tracking.
- Settlement cycles: periodic, with reconciliation gates.
- Collection cycle per case: soft → hard → legal escalation.
- Restructuring approval with multi-level approvals and observation period.
- DLG invocation pipeline.
- Onboarding due diligence for partners / vendors.
- Periodic KYC refresh workflows.
What doesn’t go into the workflow engine
- Per-event handlers (kept in service code).
- Sub-second hot-path decisioning.
- High-frequency batch processing (use Spring Batch or simple cron-based services).
Rule engine
Underwriting policy is rule-heavy and changes frequently. Encoding rules in service code locks domain users out and slows iteration.
Options
| Engine | Strengths | Weaknesses |
|---|---|---|
| Drools | Mature; rule-language (DRL); fact-based reasoning | Heavy; JVM-only; learning curve |
| JSON Logic / json-logic-java | Lightweight; rule-as-JSON; easy to serialise / version | Limited expressiveness for complex rules |
| OpenL Tablets | Decision tables in Excel format | Niche; Excel-centric |
| Spring Expression Language (SpEL) | Lightweight, JVM-native | Not a real rule engine; limited rule management |
| Custom DSL | Domain-tailored | Cost to build; maintenance |
| Decision tables in DB | Simplest; product team can edit | No version control built-in unless added |
Recommendation
For a lending platform with 100 – 500 rules across products / partners / channels:
- MVP: Decision tables in DB + JSON Logic for rule expressions. Editable by ops via admin UI; version-controlled per row; sandbox via DB clone.
- Production: Drools if rule complexity grows substantially and a rule-engineer hire is feasible. Otherwise keep the JSON-Logic + decision-table approach with stronger tooling (sandbox, versioning, audit).
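A decision table held in the database can be read as ordered rows of match conditions plus an outcome. The sketch below is a minimal in-memory illustration of that idea. The `Row` columns (`product`, `partner`, `minScore`, `outcome`), the wildcard convention, the first-match policy, and the sample rows are all assumptions made for illustration, not a schema taken from this platform.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical decision-table row; a null product or partner acts as a wildcard.
record Row(String product, String partner, int minScore, String outcome) {
    boolean matches(String p, String pt, int score) {
        return (product == null || product.equals(p))
            && (partner == null || partner.equals(pt))
            && score >= minScore;
    }
}

final class DecisionTable {
    // Rows are evaluated in order; the first matching row wins (assumed policy).
    private final List<Row> rows;

    DecisionTable(List<Row> rows) {
        this.rows = List.copyOf(rows);
    }

    Optional<String> decide(String product, String partner, int score) {
        return rows.stream()
                   .filter(r -> r.matches(product, partner, score))
                   .map(Row::outcome)
                   .findFirst();
    }

    // Illustrative rows only; real rows would be loaded from the DB per version.
    static DecisionTable sample() {
        return new DecisionTable(List.of(
            new Row("TERM_LOAN", "PARTNER_A", 700, "AUTO_APPROVE"),
            new Row("TERM_LOAN", null, 650, "MANUAL_REVIEW"),
            new Row(null, null, 0, "DECLINE")));
    }
}
```

Ordering rows from most specific to least specific keeps partner-level overrides ahead of product-level defaults, with a catch-all row at the end so that every application receives an outcome.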
Rule lifecycle
- Domain user proposes a rule edit in admin console.
- System generates a draft rule version.
- Sandbox test against historical applications.
- Optional champion-challenger run comparing new vs current rule outcomes.
- Approve + set effective date.
- Production deploy (versioned; existing in-flight applications continue with the version they started with).
- Per-application execution records exact rule version used.
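The last steps of the lifecycle (effective-dated deployment, with in-flight applications pinned to the version they started with) can be sketched as follows. `RuleVersion`, `RuleVersionResolver`, and the in-memory pin map are hypothetical names for illustration; in practice the pin would more likely be a column on the application or decision record.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical approved rule version with its effective date.
record RuleVersion(String ruleId, String version, LocalDate effectiveFrom) {}

final class RuleVersionResolver {
    private final List<RuleVersion> approved;
    // "applicationId|ruleId" -> version pinned at first evaluation.
    private final Map<String, RuleVersion> pinned = new HashMap<>();

    RuleVersionResolver(List<RuleVersion> approved) {
        this.approved = List.copyOf(approved);
    }

    // Latest approved version whose effective date is on or before the given date.
    RuleVersion effectiveOn(String ruleId, LocalDate date) {
        return approved.stream()
            .filter(v -> v.ruleId().equals(ruleId) && !v.effectiveFrom().isAfter(date))
            .max(Comparator.comparing(RuleVersion::effectiveFrom))
            .orElseThrow(() -> new IllegalStateException("No effective version for " + ruleId));
    }

    // In-flight applications keep the version resolved on their first evaluation.
    RuleVersion versionFor(String applicationId, String ruleId, LocalDate today) {
        return pinned.computeIfAbsent(applicationId + "|" + ruleId,
                                      k -> effectiveOn(ruleId, today));
    }
}
```

With versions v2.0 (effective 2026-01-01) and v2.1 (effective 2026-04-01), an application first evaluated in March keeps v2.0 even when re-evaluated in April, while an application that starts in April resolves to v2.1.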
Rule trace
Every decision_run captures the full rule trace: each rule evaluated, with its inputs, intermediate values, output, and contribution. The trace is stored as JSON for:
- Audit (regulator / internal).
- Borrower explainability (subject to compliance review).
- Model performance monitoring (champion-challenger).
See the example trace in 6.13 End-to-end decision walkthrough.
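As a rough illustration of how such a trace might be accumulated during a decision run, the sketch below collects one entry per evaluated rule and exposes a Map/List document that a JSON library could serialise. The class names and the field names (`decision_run_id`, `rules`, and so on) are assumptions, not the platform's actual trace schema, and intermediate values are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One evaluated rule inside a decision run; field names are illustrative.
record TraceEntry(String ruleId, String ruleVersion,
                  Map<String, Object> inputs, Object output, String contribution) {}

final class DecisionTrace {
    private final String decisionRunId;
    private final List<TraceEntry> entries = new ArrayList<>();

    DecisionTrace(String decisionRunId) {
        this.decisionRunId = decisionRunId;
    }

    void record(String ruleId, String ruleVersion,
                Map<String, Object> inputs, Object output, String contribution) {
        // Copy the inputs so later changes to application data cannot alter the audit record.
        entries.add(new TraceEntry(ruleId, ruleVersion,
                                   new LinkedHashMap<>(inputs), output, contribution));
    }

    // Plain Map/List structure, ready to hand to a JSON serialiser.
    Map<String, Object> toDocument() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("decision_run_id", decisionRunId);
        List<Map<String, Object>> rules = new ArrayList<>();
        for (TraceEntry e : entries) {
            Map<String, Object> r = new LinkedHashMap<>();
            r.put("rule_id", e.ruleId());
            r.put("rule_version", e.ruleVersion());
            r.put("inputs", e.inputs());
            r.put("output", e.output());
            r.put("contribution", e.contribution());
            rules.add(r);
        }
        doc.put("rules", rules);
        return doc;
    }
}
```

Copying the inputs at record time matters for audit: the trace should reflect what the rule saw when it ran, not the current state of the application.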
A worked sanction-to-disbursement workflow
This combines the workflow engine and the rule engine in a typical platform flow.
Workflow: SanctionToDisbursement

Trigger: sanction.issued event

1. Wait for kfs.acknowledged event (with timeout T1).
   - On timeout: nudge borrower + extend; max 3 nudges, then expire sanction.
2. Trigger document generation.
   - Rule engine: select template version per product / partner / language.
   - Generates KFS, agreement, DPN, PG document IDs.
3. Trigger eSign multi-signer flow (long-running; hours to days possible).
   - On signer failure: alternate vendor; max 2 retries, then refer to ops.
4. Trigger eStamp.
   - If state unavailable: queue + retry; alternate vendor.
5. Wait for mandate.active event (with timeout T2).
   - On timeout: borrower re-engage workflow.
6. Pre-disbursement checklist run.
   - Rule engine: per-product checklist.
   - If pass: proceed.
   - If fail: route to ops queue (workflow waits); on resolution, retry.
7. Trigger disbursement (sync to payout rail).
   - On failure: idempotent retry; if persistent, ops escalation.
8. On utr.captured event: emit loan.activated.
9. Workflow complete.

Compensation on cancellation:
- Revoke generated documents (mark void).
- Cancel mandate with sponsor bank.
- Notify borrower.
- Release any held funds.
- Audit trail of cancellation.

This shape (workflow orchestrating, rules deciding) repeats across every long-running flow.
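The compensation behaviour of such a saga can be sketched in plain Java, independent of any engine. This is only an in-memory illustration of reverse-order compensation: in a real deployment the durable state, timers, and retries would come from the workflow engine (Temporal or Camunda), and `SagaSketch`, `Step`, and the step names used with it are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class SagaSketch {
    interface Step {
        String name();
        void execute() throws Exception; // forward action
        void compensate();               // undo action; should itself be safe to retry
    }

    // Simple in-memory log so the order of actions can be inspected.
    final List<String> log = new ArrayList<>();

    // Runs steps in order; on failure, compensates completed steps in reverse order.
    boolean run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step s : steps) {
            try {
                s.execute();
                log.add("done:" + s.name());
                completed.push(s);
            } catch (Exception e) {
                log.add("failed:" + s.name());
                while (!completed.isEmpty()) {
                    Step c = completed.pop();
                    c.compensate();
                    log.add("compensated:" + c.name());
                }
                return false;
            }
        }
        return true;
    }

    // Hypothetical step factory for demonstration; real steps would call other modules.
    static Step step(String name, boolean fails) {
        return new Step() {
            public String name() { return name; }
            public void execute() throws Exception {
                if (fails) throw new Exception(name + " failed");
            }
            public void compensate() { /* e.g. mark documents void, cancel mandate */ }
        };
    }
}
```

Running steps `generateDocuments`, `registerMandate`, and a failing `disburse` unwinds the mandate first and the documents second. An explicit cancellation would trigger the same unwinding over whichever steps had completed.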
A sample rule executed by the engine
Rule: BUREAU_SCORE_MIN_650_PROMOTER
In JSON Logic:
    {
      "rule_id": "BUREAU_SCORE_MIN_650_PROMOTER",
      "rule_version": "v2.1",
      "expression": {
        ">=": [
          { "var": "promoter.bureau_score.worst_of" },
          650
        ]
      },
      "outcomes": {
        "true": { "action": "PASS", "grade_contribution": null },
        "false": { "action": "DECLINE", "reason_code": "BUREAU_SCORE_BELOW_THRESHOLD" }
      },
      "metadata": {
        "owner": "credit-policy-team",
        "effective_from": "2026-04-01",
        "audit_link": "https://internal/policy/audit/BUREAU_SCORE_MIN_650"
      }
    }

The engine evaluates the expression against the application’s data inputs and records the outcome in the decision trace.
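To make the evaluation step concrete, here is a hand-rolled sketch that handles only the two JSON Logic operators used by this rule, `var` (with dotted paths) and `>=`. It is not the json-logic-java library and not a complete JSON Logic implementation. It assumes the expression has already been parsed into nested Maps and Lists, and it treats a missing or non-numeric operand as a failed comparison, which is an assumption a real policy might replace with a refer-to-ops outcome.

```java
import java.util.List;
import java.util.Map;

final class MiniJsonLogic {
    // Evaluates a parsed JSON Logic node: a Map with one operator key, or a literal.
    static Object eval(Object node, Map<String, Object> data) {
        if (!(node instanceof Map<?, ?> op)) return node; // literal value
        Map.Entry<?, ?> e = op.entrySet().iterator().next();
        String name = (String) e.getKey();
        Object arg = e.getValue();
        switch (name) {
            case "var":
                return lookup((String) arg, data);
            case ">=": {
                List<?> args = (List<?>) arg;
                Object l = eval(args.get(0), data);
                Object r = eval(args.get(1), data);
                // Assumption: missing or non-numeric operands fail the comparison.
                if (!(l instanceof Number) || !(r instanceof Number)) return false;
                return ((Number) l).doubleValue() >= ((Number) r).doubleValue();
            }
            default:
                throw new UnsupportedOperationException("Operator not in this sketch: " + name);
        }
    }

    // Resolves a dotted path such as "promoter.bureau_score.worst_of" in nested Maps.
    static Object lookup(String path, Map<String, Object> data) {
        Object cur = data;
        for (String key : path.split("\\.")) {
            if (!(cur instanceof Map<?, ?> m)) return null;
            cur = m.get(key);
        }
        return cur;
    }

    // The expression from BUREAU_SCORE_MIN_650_PROMOTER, already parsed.
    static final Map<String, Object> EXPRESSION =
        Map.of(">=", List.of(Map.of("var", "promoter.bureau_score.worst_of"), 650));

    // Maps the boolean result to the rule's "true" / "false" outcome actions.
    static String action(Map<String, Object> application) {
        return Boolean.TRUE.equals(eval(EXPRESSION, application)) ? "PASS" : "DECLINE";
    }
}
```

Keeping the expression as data rather than code is what lets the same evaluator serve every rule version: only the stored JSON changes between versions.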
Operational concerns
- Workflow engine HA: if Temporal / Camunda is down, in-progress workflows pause; no data is lost. Plan for restart procedures.
- Rule engine performance: rule execution must typically complete in < 200 ms for synchronous decision paths. Cache rule sets aggressively; preload at service startup.
- Rule deployment safety: a champion-challenger framework is mandatory for high-impact policy changes. Roll back if production metrics deteriorate.
- Workflow version migration: in-flight workflows may use old versions of process definitions. Either migrate (carefully) or let them complete naturally.
- Idempotency in workflow activities: same rules as event consumers — every activity must be safe to retry.
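One common way to make an activity safe to retry is to key each side effect by an idempotency key and return the stored result on repeat calls. The sketch below shows the idea with an in-memory map. `IdempotentActivity`, `runOnce`, and the key format are hypothetical; a production version would persist keys in a table with a unique constraint and pass the same key to the downstream payout rail.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class IdempotentActivity {
    // idempotencyKey -> stored result; a DB table with a unique key in practice.
    private final Map<String, String> results = new ConcurrentHashMap<>();

    // Runs the side effect once per key; retries with the same key get the stored result.
    // If the side effect throws, nothing is stored, so a later retry runs it again.
    String runOnce(String idempotencyKey, Supplier<String> sideEffect) {
        return results.computeIfAbsent(idempotencyKey, k -> sideEffect.get());
    }
}
```

For example, a disbursement activity might call `runOnce("disburse:" + loanId, ...)`, so that an engine-level retry after a timeout returns the original reference instead of paying out twice.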
Related
- 6. Underwriting: rule library content.
- 6.6 Sample rules library: 45 worked rules.
- 6.13 End-to-end decision walkthrough: full engine run.
- 5.8 Sequence diagrams: workflow patterns in diagram form.