
5.1 High-level architecture

The platform stack from channel UIs down to data:

┌──────────────────────────────────────────────────────────┐
│ CHANNEL UIs                                              │
│ Borrower web/app | DSA portal | CA portal | Partner      │
│ Anchor portal | Admin console | Field-agent app          │
└─────────┬────────────────────────────────────────────────┘
          │ HTTPS, OAuth2, MFA
┌─────────▼────────────────────────────────────────────────┐
│ EDGE                                                     │
│ CDN | WAF | API Gateway | Rate limiter | Request log     │
└─────────┬────────────────────────────────────────────────┘
┌─────────▼────────────────────────────────────────────────┐
│ BFF / GraphQL or REST for channel UIs                    │
│ Per-channel BFFs for borrower / partner / admin          │
└─────────┬────────────────────────────────────────────────┘
┌─────────▼────────────────────────────────────────────────┐
│ DOMAIN SERVICES (modular monolith)                       │
│ Acquisition | Application | KYC | Ingestion |            │
│ Decisioning | Manual review | Docs | Disbursement |      │
│ LMS | Collections | Monitoring | Accounting |            │
│ Reporting | Admin | Notification | Co-lending |          │
│ Settlement                                               │
└─────────┬────────────────────────────────────────────────┘
          │ REST + async events
┌─────────▼────────────────────────────────────────────────┐
│ INTEGRATION LAYER (vendor adapters)                      │
│ Bureau | KYC | AA | BSA | GST | Tally | eSign |          │
│ eStamp | NACH | UPI | PG | Payout | Sponsor bank |       │
│ SMS / WA / Email / IVR / Dialer | Field-app push         │
└─────────┬────────────────────────────────────────────────┘
┌─────────▼────────────────────────────────────────────────┐
│ DATA LAYER                                               │
│ PostgreSQL (OLTP) | Redis (cache) | Object store (S3)    │
│ Kafka / RabbitMQ (events) | OpenSearch | Warehouse +     │
│ dbt | Vault / KMS                                        │
└──────────────────────────────────────────────────────────┘

Each layer has one job. Each layer can be scaled or replaced independently.

  • Cloud: AWS Mumbai (ap-south-1) primary across 2–3 AZs; AWS Hyderabad (ap-south-2) DR with continuous replication.
  • Compute: Kubernetes (EKS) for stateless services. Aurora PostgreSQL Multi-AZ for OLTP. ElastiCache (Redis) Multi-AZ.
  • Event bus: RabbitMQ at MVP, MSK (managed Kafka) at scale (see 5.4).
  • Object storage: S3 with versioning + Object Lock on evidence buckets.
  • Search: OpenSearch for log indexing + free-text search of cases / documents.
  • Warehouse: Snowflake or ClickHouse at scale; PostgreSQL replica at MVP (see 5.6).
  • CDN: CloudFront / Cloudflare for portals and static assets.
  • Observability: Datadog OR Prometheus + Grafana + Loki + Tempo.
  • Secrets: AWS Secrets Manager + KMS, or HashiCorp Vault.

Alternative clouds (Azure Pune/Mumbai, OCI Mumbai/Hyderabad, GCP Mumbai) are valid; the architectural shape is portable.

The RBI IT Master Direction (IT MD, see 2.13) expects geographic separation between the primary and DR sites.

  • Primary: ap-south-1 across 2–3 AZs.
  • DR: ap-south-2 (Hyderabad).
  • RPO: ≤ 15 minutes for OLTP (continuous CDC + cross-region replication).
  • RTO: ≤ 4 hours for full system recovery.
  • DR drill: quarterly for critical systems; annual end-to-end.
  • Runbook: documented per failure scenario; tested in drills.
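The RPO and RTO targets above can be checked mechanically after each drill. The following is a minimal sketch, assuming the worst observed replication lag and the measured recovery time are available from monitoring; `DrillResult` and `evaluate_drill` are hypothetical names, not part of the platform.

```python
from dataclasses import dataclass
from datetime import timedelta

# Targets taken from the list above.
RPO_TARGET = timedelta(minutes=15)
RTO_TARGET = timedelta(hours=4)


@dataclass(frozen=True)
class DrillResult:
    """Hypothetical measurements captured during a DR drill."""
    replication_lag: timedelta  # worst observed cross-region lag (data-loss window)
    recovery_time: timedelta    # time from declared failure to full service in DR


def evaluate_drill(result: DrillResult) -> list[str]:
    """Return the breached targets; an empty list means the drill met both."""
    breaches = []
    if result.replication_lag > RPO_TARGET:
        breaches.append(f"RPO breached: lag {result.replication_lag} > {RPO_TARGET}")
    if result.recovery_time > RTO_TARGET:
        breaches.append(f"RTO breached: recovery {result.recovery_time} > {RTO_TARGET}")
    return breaches
```

A drill that lands exactly on a target (for example, a lag of exactly 15 minutes) passes, since both targets are stated as upper bounds.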

Flow A — Borrower applies via own portal

  1. Borrower lands on web/app → mobile-OTP authentication.
  2. Borrower UI calls Borrower-BFF.
  3. BFF calls Application service → creates Application (status draft).
  4. Borrower fills KYC, gives AA + GST + bureau consents; Ingestion pulls data in parallel.
  5. Borrower submits → Decisioning runs full engine → returns APPROVE / DECLINE / REFER.
  6. On APPROVE: Sanction created → Docs service generates KFS + agreement.
  7. Borrower acknowledges the KFS and eSigns the agreement (multi-signer flow); the eStamp is issued.
  8. NACH mandate activates (eNACH via Aadhaar).
  9. Pre-disbursement checklist runs → Disbursement service triggers payout via sponsor-bank API → UTR captured.
  10. LMS activates loan account → daily accrual begins → classification job runs end-of-day.

See 5.8 Sequence diagrams for the full ladder.
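Steps 4 to 6 of Flow A can be sketched as follows: consented data sources are pulled concurrently, and the engine outcome selects the next step. This is an illustrative sketch only. `pull_source` stands in for the real vendor adapters, and the step labels in `NEXT_STEP` are assumptions; Flow A spells out only the APPROVE path.

```python
import asyncio


async def pull_source(name: str) -> tuple[str, dict]:
    """Stand-in for one vendor adapter call; a real call would be network I/O."""
    await asyncio.sleep(0.01)  # simulated latency
    return name, {"status": "ok"}


async def ingest_all(consented_sources: list[str]) -> dict[str, dict]:
    """Pull all consented sources concurrently rather than one after another."""
    pairs = await asyncio.gather(*(pull_source(s) for s in consented_sources))
    return dict(pairs)


# Hypothetical next-step labels keyed by the three engine outcomes.
NEXT_STEP = {
    "APPROVE": "create_sanction",     # step 6: Sanction, then KFS + agreement
    "REFER": "queue_manual_review",   # assumed: handled by the Manual review module
    "DECLINE": "close_application",   # assumed: the application ends here
}


def next_step(decision: str) -> str:
    """Map an engine outcome to the next workflow step; reject unknown outcomes."""
    try:
        return NEXT_STEP[decision]
    except KeyError:
        raise ValueError(f"unknown decision: {decision!r}") from None
```

With concurrent pulls, the wait before decisioning is roughly the slowest single source rather than the sum of all of them.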

Flow B: DSA-assisted application

  1. DSA logs into DSA portal (SSO via internal IDP).
  2. DSA-BFF calls the same Application service, with DSA-attribution metadata captured upfront.
  3. Borrower receives consent links via SMS / WhatsApp; completes KYC / AA from their own phone.
  4. Remainder same as Flow A.
  5. On disbursement, DSA payout accrual is created and visible on the DSA portal.

Flow C — Co-lent loan (CLM-1, single partner)

  1. Application and decision proceed as in Flow A, except that the decision engine runs the originator’s and the partner’s policies in parallel.
  2. On dual approve, Co-lending Allocation service splits the loan (80:20 partner:NBFC by default).
  3. Sanction reflects both lenders; KFS discloses both.
  4. Disbursement coordinates with sponsor-bank escrow: partner funds released → originator funds released → combined amount to borrower with single UTR.
  5. LMS books two share-level ledgers + one consolidated borrower ledger.
  6. Daily / weekly settlement service moves funds from escrow to each lender per agreement.
  7. NPA classification (when triggered) updates both lenders in lockstep.

See 7. Co-lending deep dive and 5.8 Sequence diagrams for full mechanics.
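The allocation in step 2 and the ledger invariant in step 5 can be illustrated with a short sketch. The rounding rule here (partner share rounded half-up to the paisa, NBFC share taken as the remainder) is an assumption for illustration; the actual rule would come from the co-lending agreement. `split_loan` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

PAISA = Decimal("0.01")


def split_loan(amount: Decimal, partner_share: Decimal = Decimal("0.80")) -> dict[str, Decimal]:
    """Split a sanctioned amount between the partner and the originating NBFC.

    The partner share is rounded to the paisa and the NBFC share is the
    remainder, so the two share-level ledgers always sum to the borrower ledger.
    """
    if not (Decimal("0") < partner_share < Decimal("1")):
        raise ValueError("partner_share must be strictly between 0 and 1")
    partner = (amount * partner_share).quantize(PAISA, rounding=ROUND_HALF_UP)
    nbfc = amount - partner
    return {"partner": partner, "nbfc": nbfc, "borrower_total": partner + nbfc}
```

Computing one share as the remainder, instead of rounding both independently, avoids a one-paisa mismatch between the share ledgers and the consolidated borrower ledger.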

Flow D: Scheduled NACH repayment

  1. End-of-day, LMS generates the NACH presentation file for the next day’s due dates.
  2. NACH adapter pushes file to sponsor bank via SFTP at the cut-off.
  3. Sponsor bank submits to NPCI; NPCI processes overnight.
  4. Sponsor bank returns ack file next day with success / bounce per row.
  5. LMS processes each result:
    • Success: repayment recorded → waterfall allocation (penal → fees → interest → principal) → events emitted.
    • Bounce: bounce fee applied → re-presentation per NACH rules → if persistent, case enters collections queue.
  6. Reconciliation engine matches the sponsor-bank statement against expected entries; mismatches go to an exceptions queue.
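The waterfall allocation in step 5 can be sketched as below. The bucket order is taken from the text; the bucket names, the `allocate` function, and the treatment of any excess (returned unallocated) are assumptions for illustration.

```python
from decimal import Decimal

# Order taken from step 5 above: penal, then fees, then interest, then principal.
WATERFALL_ORDER = ("penal", "fees", "interest", "principal")


def allocate(payment: Decimal, dues: dict[str, Decimal]) -> tuple[dict[str, Decimal], Decimal]:
    """Apply a payment to outstanding dues bucket by bucket.

    Returns (amount applied per bucket, unallocated excess).
    """
    remaining = payment
    allocation = {}
    for bucket in WATERFALL_ORDER:
        applied = min(remaining, dues.get(bucket, Decimal("0")))
        allocation[bucket] = applied
        remaining -= applied
    return allocation, remaining
```

For example, with dues of 100 penal, 500 fees, 2,000 interest, and 8,000 principal, a payment of 2,400 clears penal and fees, covers 1,800 of the interest, and leaves principal untouched.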

Trust boundaries

  • Public internet ↔ edge: WAF + rate limit + mTLS for partner APIs.
  • Edge ↔ internal services: internal network; mTLS recommended for sensitive paths.
  • Services ↔ external vendors: outbound API gateway / egress proxy with allowlist; centralised egress logging.
  • Service ↔ database: IAM-authenticated; least-privilege; no shared admin credentials.
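The egress allowlist for vendor calls can be illustrated with a small check. This is a sketch only: the hostnames are placeholders, and in practice the allowlist would be enforced in the egress proxy or gateway configuration, not in application code.

```python
from urllib.parse import urlsplit

# Placeholder vendor hosts; a real allowlist would live in proxy / gateway config.
EGRESS_ALLOWLIST = {"api.bureau.example", "aa.example", "esign.example"}


def egress_allowed(url: str) -> bool:
    """Allow only HTTPS calls whose host exactly matches an allowlisted entry."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in EGRESS_ALLOWLIST
```

Matching on the exact parsed hostname, not on a substring or prefix, means a lookalike such as `api.bureau.example.evil.test` is rejected.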

Resilience and availability

  • Stateful: Multi-AZ (Aurora, ElastiCache).
  • Stateless: autoscaling; minimum N+1 capacity.
  • Circuit breakers on every vendor call (Resilience4j).
  • Bulkheads: separate thread pools / connection pools per critical external system to prevent cascading failure.
  • Idempotency keys on every mutating external call.
  • Outbox pattern for at-least-once event publishing.
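To make the circuit-breaker behaviour concrete, here is a minimal, language-neutral sketch in Python. It is not the Resilience4j API (a Java library); the class, its parameters, and the simple consecutive-failure threshold are assumptions chosen to show the closed, open, and half-open states.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without reaching the vendor."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("vendor circuit open; failing fast")
            # Cool-down elapsed: half-open, so one trial call is let through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or reaching the threshold, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

One breaker instance per vendor adapter keeps a failing bureau or eSign provider from consuming threads that other integrations need, which is the same intent as the bulkheads listed above.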

Per the RBI Payment System Data Storage direction, the AA framework, the KYC MD, and the IT MD:

  • All borrower data physically in Indian regions.
  • Cross-region backups only to the Indian DR region.
  • No vendor that requires data egress out of India is used for regulated data flows.
  • DR plan documented; cross-region tested.

Why this shape (not microservices from day 1)


A pure microservices architecture on day 1 is a mistake for an early NBFC engineering team. Reasons:

  1. Operational overhead: 15–20 services from day 1 means 15–20 deployment pipelines, sets of observability dashboards, and on-call rotations.
  2. Distributed-transaction complexity: multi-service consistency requires either sagas (workflow engine) or eventual consistency (complex to reason about for accounting flows). The modular monolith keeps these as DB transactions until necessary.
  3. Refactoring friction: module boundaries take time to settle; in a monolith, refactors are cheap; across microservices, they are expensive.
  4. Team velocity: until the platform is shipped and observed, splitting services trades delivery speed for premature engineering optimisation.

The right shape: modular monolith with strict module boundaries, with selective extraction once specific scaling / regulatory / partner-isolation triggers fire. See 5.2 Services and modules for the extraction criteria.