6.15 Unconventional data sources for thin-file underwriting

This page is a catalogue with operational detail. For each source it covers what the source tells you, how to obtain it (vendor / consent), what signal it generates, and what fraud patterns to watch for.

The goal is to compose 4 – 6 of these sources into a triangulated picture when standard data (GST + bank + bureau + Tally) is incomplete.

For an owner-operated SME running customer payments through UPI on a current or savings account, AA-fetched UPI history reveals:

  • Daily / weekly UPI inflow pattern — proxy for retail / counter sales.
  • Number of unique payer VPAs / accounts — customer base size.
  • Concentration — top payer share.
  • Refund / reversal pattern — return rate proxy.
  • Time-of-day / day-of-week pattern — operational rhythm.
  • Account Aggregator pull (see 4.3). The borrower’s bank account’s UPI-credit transactions are part of the standard DEPOSIT FI type data.
  • PSP-side data (Razorpay, Cashfree merchant dashboards) — borrower may export their own settlement reports.
  • Net business activity for owner-operated retail / services.
  • Concentration risk.
  • Activity continuity (gaps in UPI flow signal trouble).
  • Cycling UPI between own accounts (round-tripping to inflate inflow): detect via counter-party analysis; if many transactions are between accounts owned by the same person, net them out.
  • Pre-loan UPI inflation — borrower instructs friends / family to send UPI just before underwriting; pattern shows sudden spike 2 – 4 weeks before application.
  • VPA spoofing — uncommon but check for repeated identical VPAs at high frequency.
  • The AA TSP returns UPI transactions that are identifiable by the UPI tag in the transaction narration; the bank-statement-analysis (BSA) layer categorises them.
  • For PSP-side data: the borrower exports and uploads the reports (less authenticated than AA).
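The UPI signals and fraud checks listed above can be sketched in a few lines. This is a minimal illustration, not a production BSA rule set: the tuple layout `(date, payer_vpa, amount)`, the `own_vpas` input, and the spike thresholds (28-day window, 84-day baseline, 2x ratio) are all assumptions made for the example.

```python
from collections import Counter
from datetime import date, timedelta


def upi_signals(credits, own_vpas):
    """Summarise UPI credits after netting out the borrower's own VPAs.

    credits: list of (date, payer_vpa, amount) tuples (assumed layout).
    own_vpas: set of VPAs attributed to the borrower by counter-party
    analysis (hypothetical input).
    """
    # Round-tripping: drop credits that come from the borrower's own accounts.
    external = [c for c in credits if c[1] not in own_vpas]
    by_payer = Counter()
    for _, vpa, amount in external:
        by_payer[vpa] += amount
    total = sum(by_payer.values())
    top_share = max(by_payer.values()) / total if total else 0.0
    return {
        "net_inflow": total,
        "unique_payers": len(by_payer),
        "top_payer_share": top_share,
    }


def pre_application_spike(credits, application_date,
                          window_days=28, baseline_days=84, ratio=2.0):
    """Flag average daily inflow that jumps just before the application."""
    window_start = application_date - timedelta(days=window_days)
    baseline_start = window_start - timedelta(days=baseline_days)
    recent = sum(a for d, _, a in credits
                 if window_start <= d < application_date)
    baseline = sum(a for d, _, a in credits
                   if baseline_start <= d < window_start)
    recent_daily = recent / window_days
    baseline_daily = baseline / baseline_days
    if baseline_daily == 0:
        # No earlier activity at all: any recent inflow is itself a spike.
        return recent_daily > 0
    return recent_daily / baseline_daily >= ratio
```

Comparing daily averages rather than raw totals keeps the two windows comparable even though they differ in length. The thresholds would need calibration against the lender's own portfolio.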

For borrowers using physical POS terminals (Razorpay POS, mPOS, Pine Labs, Innoviti, Mswipe), settlement reports show:

  • Daily card / UPI swipe revenue.
  • Number of transactions.
  • Average ticket size per swipe.
  • Refunds / disputes.
  • Per-terminal breakup if multi-terminal.
  • PSP merchant portal export — borrower downloads settlement reports.
  • API integration with PSPs that expose seller APIs.
  • Bank account analysis — POS settlements come into the borrower’s bank account as identifiable daily / weekly batches; BSA categorisation surfaces them.
  • Retail / hospitality activity baseline.
  • Vintage if reports go back 12+ months.
  • Customer demand stability.
  • Seasonality patterns.
  • Round-tripping via own cards — borrower swipes own card to inflate POS revenue; detect via repeated identical card numbers / pattern of suspiciously round amounts.
  • Settlement-account mismatch — borrower claims POS revenue but settlement goes to a different bank account; verify settlement account.
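The own-card round-tripping check above could look roughly like the following. It is a sketch under assumptions: swipes are given as `(masked_card, amount)` pairs, and the thresholds (20% share for a single card, 50% round amounts, rounding unit of ₹1,000) are placeholders rather than recommended values.

```python
from collections import Counter


def pos_round_trip_flags(swipes, max_card_share=0.2,
                         max_round_share=0.5, round_unit=1000):
    """Flag POS revenue that leans on one card or on round amounts.

    swipes: list of (masked_card, amount) pairs (assumed layout).
    """
    total = sum(amount for _, amount in swipes)
    if not swipes or total == 0:
        return {"top_card_share": 0.0, "round_amount_share": 0.0,
                "flagged": False}
    by_card = Counter()
    for card, amount in swipes:
        by_card[card] += amount
    # Share of revenue contributed by the single most-used card.
    top_card_share = max(by_card.values()) / total
    # Share of swipes whose amount is an exact multiple of round_unit.
    round_share = sum(1 for _, a in swipes if a % round_unit == 0) / len(swipes)
    flagged = top_card_share > max_card_share or round_share > max_round_share
    return {"top_card_share": top_card_share,
            "round_amount_share": round_share,
            "flagged": flagged}
```

A flag here is a prompt for manual review, not proof of fraud: a small shop may legitimately have one large repeat customer.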

For sellers on Amazon, Flipkart, Meesho, Myntra, JioMart, etc., settlement reports show:

  • Gross orders + returns + net settlements.
  • Per-marketplace breakup.
  • Category mix.
  • Refund / return rates.
  • Ad-spend (if borrower shares).
  • Cycle history (typically weekly / fortnightly).
  • Marketplace API where available (Amazon SP-API is the most mature; Flipkart Seller API; Meesho Supplier Panel — variable).
  • Borrower upload of CSV / Excel settlement reports.
  • Authenticated screen capture via vendor tools (less ideal).
  • True net revenue (after returns / fees).
  • Demand stability.
  • Operational consistency.
  • Concentration vs marketplace diversity.
  • Fabricated reports — borrower edits CSV to inflate. Verify via marketplace API where possible; reconcile against bank settlement credits.
  • Account multi-pretence — borrower claims multiple marketplace accounts that share underlying ownership; check account-holder names.
  • Cancelled-account hiding — borrower’s primary account was suspended; they share an active secondary account that’s small. Verify account standing.
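Reconciling an uploaded settlement report against bank settlement credits, as suggested for fabricated reports, can be sketched as a simple matching pass. The input layouts, the 5-day settlement lag and the ₹1 tolerance are assumptions for illustration.

```python
from datetime import date, timedelta


def unmatched_settlements(reported, bank_credits,
                          max_lag_days=5, tolerance=1.0):
    """Return reported settlements that have no matching bank credit.

    reported: list of (settlement_date, amount) from the borrower's report.
    bank_credits: list of (credit_date, amount) from the bank statement.
    """
    unused = list(bank_credits)
    unmatched = []
    for s_date, s_amount in reported:
        match = next(
            (c for c in unused
             if timedelta(0) <= c[0] - s_date <= timedelta(days=max_lag_days)
             and abs(c[1] - s_amount) <= tolerance),
            None,
        )
        if match is None:
            unmatched.append((s_date, s_amount))
        else:
            # Each bank credit can support only one reported settlement.
            unused.remove(match)
    return unmatched
```

A high share of unmatched settlements suggests either an edited report or settlements landing in an account the lender has not seen.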

For invoices issued by taxpayers above the e-invoice threshold (currently ₹5 Cr aggregate annual turnover for issuers; the threshold has been lowered periodically), the IRN (Invoice Reference Number) on the GST IRP (Invoice Registration Portal) confirms the invoice’s authenticity in real time. Even thin-file borrowers can produce e-invoices if their buyer issued them; the buyer’s IRN is queryable.

  • GST IRP query API via a GSP.
  • Borrower provides IRN; system verifies.
  • Invoice authenticity (vs forged invoice for invoice-financing fraud).
  • Underlying buyer’s GST identity.
  • Invoice timing (the IRP acknowledgement for the IRN carries the generation date and time).

Particularly valuable for invoice-backed thin-file lending where the borrower’s overall financial picture is thin but a specific invoice is to be financed. Verifying that the buyer is a real entity issuing real invoices materially reduces fraud risk on that specific draw.
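Once an IRN lookup has returned a record, the invoice the borrower presented still has to be compared with it field by field. The sketch below assumes the lookup result has already been fetched and flattened into a dict; the field names and the `"ACTIVE"` status value are hypothetical and do not reflect any particular GSP's response schema.

```python
# Hypothetical field names; a real GSP response would need mapping to these.
FIELDS = ("irn", "seller_gstin", "buyer_gstin", "invoice_no", "invoice_value")


def invoice_mismatches(presented, verified, fields=FIELDS):
    """List the problems found when comparing a presented invoice
    with the record returned by the IRN lookup.

    verified is None when the lookup found no such IRN.
    """
    if verified is None:
        return ["irn_not_found"]
    problems = [f for f in fields if presented.get(f) != verified.get(f)]
    # A cancelled IRN should not support a financing draw.
    if verified.get("status") != "ACTIVE":
        problems.append("irn_not_active")
    return problems
```

An empty list means the presented invoice agrees with the registered one on every compared field; anything else goes to manual review before the draw is approved.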

E-way bills are mandatory for goods movement above value thresholds (typically consignments above ₹50,000; intra-state thresholds vary by state). The pattern of e-way bill generation by a borrower indicates:

  • Logistics activity.
  • Number of consignments.
  • Geographic spread of customers.
  • Vintage of business activity.
  • E-way bill portal via GSP API (limited; primarily for the consignor’s own data).
  • Borrower’s downloaded reports.
  • Goods-business activity for borrowers who issue e-way bills.
  • Vintage if data goes back > 12 months.
  • Customer geographic spread.

E-way bill data is issuer-specific — you see the borrower’s own e-way bills, not their customers’. For verifying a customer’s claim about being a regular buyer, e-way bill data alone is insufficient.

For e-commerce sellers and direct-to-consumer brands using Shiprocket, Delhivery, Ekart, etc., the seller dashboard shows:

  • Daily shipment volume.
  • Order origins / destinations.
  • COD vs prepaid mix.
  • Return / refund rates.
  • Service-level metrics.
  • API if the logistics provider exposes seller API (Shiprocket has one).
  • Borrower download of dashboard exports.
  • True D2C / e-commerce activity (logistics is harder to fake than marketplace claims).
  • Customer geographic spread.
  • COD risk (high COD share = higher return rate).

Logistics data isn’t standardised across providers; integration is per-provider.
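Because each provider exports a different shape, one common approach is a thin adapter per provider that maps rows into a shared form before computing signals such as COD share and return rate. The sketch below uses two made-up providers (`provider_a`, `provider_b`) with invented field names; no real provider's export format is implied.

```python
def normalise(provider, row):
    """Map one provider-specific shipment row to a common shape.

    Both providers and all field names here are hypothetical.
    """
    if provider == "provider_a":
        return {"cod": row["payment_mode"] == "COD",
                "returned": row["status"] == "RETURNED"}
    if provider == "provider_b":
        return {"cod": bool(row["is_cod"]),
                "returned": bool(row["rto"])}
    raise ValueError(f"no adapter for provider {provider!r}")


def logistics_signals(shipments):
    """Compute COD share and return rate over normalised shipments."""
    n = len(shipments)
    if n == 0:
        return {"shipments": 0, "cod_share": 0.0, "return_rate": 0.0}
    return {
        "shipments": n,
        "cod_share": sum(s["cod"] for s in shipments) / n,
        "return_rate": sum(s["returned"] for s in shipments) / n,
    }
```

Keeping the adapters separate from the signal computation means a new provider only needs a new mapping, not a change to the underwriting logic.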

Electricity bills (and gas / water where relevant) are a direct proxy for plant / business activity:

  • Monthly kWh consumption: operational intensity.
  • A sudden drop of more than 30% month-on-month signals trouble (plant slowdown, owner walked away).
  • Sustained level — operational stability.
  • Connection vintage — how long the business has been operating from this premises.
  • Borrower-uploaded PDFs of last 12 – 24 months bills.
  • Discom API in select states (limited; varies).
  • Field-agent capture during visit.
  • Operational activity (especially valuable for manufacturers where electricity is a major input).
  • Vintage of operations.
  • Premises tenure.
  • Cherry-picked months — borrower shows only good months; require continuous 24-month history.
  • Fabricated bills — verify with discom or check PDF metadata.
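The two mechanical checks in this section, the greater-than-30% month-on-month drop and the continuous 24-month history, can be sketched as follows. The input layout (a dict keyed by `(year, month)` with kWh values) is an assumption for the example.

```python
def next_month(year, month):
    """Return the (year, month) that follows the given one."""
    return (year + 1, 1) if month == 12 else (year, month + 1)


def electricity_flags(readings, drop_threshold=0.30, required_months=24):
    """Check bill history for gaps and sharp month-on-month drops.

    readings: dict mapping (year, month) -> kWh consumed (assumed layout).
    """
    months = sorted(readings)
    gaps, sharp_drops = [], []
    for prev, cur in zip(months, months[1:]):
        if cur != next_month(*prev):
            # Missing bills between prev and cur: possible cherry-picking.
            gaps.append((prev, cur))
        elif readings[prev] > 0:
            drop = (readings[prev] - readings[cur]) / readings[prev]
            if drop > drop_threshold:
                sharp_drops.append(cur)
    return {
        "months_covered": len(months),
        "continuous": not gaps and len(months) >= required_months,
        "gaps": gaps,
        "sharp_drops": sharp_drops,
    }
```

Drops are only evaluated between consecutive months, so a gap is reported as a gap rather than being misread as a consumption change.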

Municipal property tax records confirm:

  • The borrower (or related party) owns the property at the claimed business address.
  • Property vintage (period for which tax has been paid).
  • Property value as per municipal record.
  • Borrower-uploaded tax receipts.
  • Municipal portal for some urban municipalities (verification only).
  • Field-agent capture.
  • Asset confirmation (the business has a physical premises owned by borrower or family).
  • Vintage (long-held property suggests stable family / business).
  • Recovery anchor — if loan defaults, owned property is a legal recovery vector (subject to SARFAESI applicability for secured products).

For borrowers without netbanking but with a current / savings account that’s actively used, photographed pages of the passbook can be OCR’d to extract transaction history.

  • Borrower-uploaded clear photos of passbook pages.
  • OCR vendor (Karza, Perfios, custom) extracts.
  • Basic transaction pattern.
  • Vintage (passbook printed dates).
  • Account activity continuity.
  • Quality varies wildly; clear photos required.
  • OCR can mis-read handwritten amounts.
  • Tampering is harder to detect than in a PDF statement.
  • Use as supplementary signal only, not primary.
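One inexpensive sanity check on OCR output, offered here as an illustrative technique rather than something the vendors named above are known to do, is to verify that each printed balance follows from the previous balance and the row's debit or credit. Rows that break the chain point to either an OCR misread or an altered entry. The row layout `(debit, credit, balance)` is an assumption.

```python
def balance_breaks(opening_balance, rows, tolerance=0.01):
    """Return indexes of rows whose printed balance does not follow
    from the previous balance plus credit minus debit.

    rows: list of (debit, credit, balance) in passbook order,
    as extracted by OCR (assumed layout).
    """
    breaks = []
    prev = opening_balance
    for i, (debit, credit, balance) in enumerate(rows):
        expected = prev - debit + credit
        if abs(expected - balance) > tolerance:
            breaks.append(i)
        # Continue from the printed balance so that a single misread
        # does not cascade into every later row.
        prev = balance
    return breaks
```

The flagged rows are the ones worth re-reading from the photograph by hand.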

J. Photographic and field-observational evidence

Photos captured by the field agent during the field-investigation (FI) visit:

  • Shop front + signage — confirms business presence, vintage hints.
  • Inside view — stock visible, layout, fit-out condition.
  • Equipment / machinery — for manufacturers.
  • Stock register — handwritten in many SMEs.
  • Owner photo at premises — confirms owner-operator status.
  • Employee count visible.
  • Field-agent app with geo-tagged photo capture mandatory.
  • Confirms business is real and operating (most important for thin-file).
  • Implies scale (inventory visible).
  • Captures fit-out vintage.
  • Staged premises: the borrower borrows / rents someone else’s premises for the visit. Mitigate with a surprise visit and neighbour confirmation.
  • Photo manipulation: geo-tag, EXIF and capture-time checks deter it.
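The geo-tag and capture-time checks can be sketched with a great-circle distance calculation. The photo dict keys (`lat`, `lon`, `captured_at`) and the 100 m radius are assumptions for the example; extracting these values from EXIF or from the field-agent app is not shown.

```python
import math
from datetime import datetime


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def photo_checks(photo, claimed_lat, claimed_lon,
                 visit_start, visit_end, max_distance_m=100):
    """Check a photo's geo-tag and capture time against the visit.

    photo: dict with 'lat', 'lon' and 'captured_at' (hypothetical keys).
    """
    distance = haversine_m(photo["lat"], photo["lon"],
                           claimed_lat, claimed_lon)
    return {
        "distance_m": distance,
        "near_premises": distance <= max_distance_m,
        "within_visit": visit_start <= photo["captured_at"] <= visit_end,
    }
```

The radius needs to allow for ordinary GPS error in dense markets, so a tight threshold will produce false alarms.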

K. Reference checks (customers + suppliers + neighbours)

Structured reference calls to 3 + 3 + 2 references (3 customers + 3 suppliers + 2 neighbours) confirm:

  • Borrower’s business relationship existence and tenure.
  • Volume / frequency of dealings.
  • Payment behaviour (whether borrower pays suppliers on time, whether customers pay borrower on time).
  • Reputation in the local market.
  • Borrower provides reference list with consent for being contacted.
  • Field agent visits / call agent calls with structured script; calls recorded for audit.
  • High-quality reference checks are among the strongest thin-file signals.
  • Detects fraud (fake business with no real customers / suppliers fails reference checks).

L. Anchor / manufacturer confirmation letters

For distributors / dealers / vendors of a larger manufacturer, an anchor confirmation letter (or anchor’s own portal data) shows:

  • Distributorship vintage with the anchor.
  • Monthly purchase volume.
  • Payment behaviour with anchor.
  • Distributorship status (active / under-review / on-watch).
  • Borrower obtains and uploads the anchor’s letter.
  • Direct verification with anchor’s commercial team.
  • Anchor portal if anchor provides distributor-portal access to lenders (rare for casual referrals; standard for formal channel-finance programmes).
  • Very high-confidence signal of business reality + scale.
  • Replaces much of standard data when anchor is reliable.

M. Trade-credit history with named suppliers

Borrower-declared list of 5 – 10 regular suppliers with whom they have running credit (typical 15 – 45 day credit terms). Verification via:

  • Supplier call confirming relationship + payment behaviour.
  • Sample invoices from supplier showing credit terms.
  • Implies the borrower has trade-credit standing in the local market.
  • Suggests payment discipline (no supplier would extend credit otherwise).

N. Footfall data (mobile-derived) for retail / hospitality

For retail counters and hospitality (restaurants, salons, clinics), anonymised footfall data derived from mobile-location data (where vendors offer it within DPDP-compliant frameworks) shows:

  • Customer footfall trend.
  • Peak hours / days.
  • Comparison with category benchmarks.
  • Specialised vendors (rare in India today; growing).
  • Borrower’s own POS / camera-based footfall counter (some new restaurants / clinics use these).
  • Demand stability for footfall-dependent businesses.
  • Coverage is uneven.
  • DPDP implications need careful handling.

O. Social-media presence and customer reviews

Public business pages and review aggregators (Google My Business, JustDial, Sulekha, IndiaMART, marketplace seller pages) show:

  • Business presence and longevity.
  • Customer review volume and sentiment.
  • Photo content showing the business.
  • Operating hours, contact details.
  • Public search (via internal tool that aggregates).
  • Borrower’s claimed handles verified.
  • Validates business presence and customer-facing reality.
  • Reviews signal customer satisfaction (proxy for business health).
  • Some businesses don’t have any online presence; absence isn’t a strong negative signal.
  • Reviews can be fake (positive or negative).

P. SHG / JLG history (for very small entities)

For borrowers with microfinance / SHG (Self-Help Group) / JLG (Joint Liability Group) history:

  • Vintage of group membership.
  • Loan repayment record within group.
  • Group’s overall performance.
  • MFI bureau (TransUnion CIBIL MFI / CRIF High Mark MFI): coverage of MFI / SHG records.
  • MFI direct confirmation if borrower had specific loans.
  • Demonstrated payment discipline at small scale; suggests willingness if not capacity to repay.
  • Vintage of formal credit-system engagement.

What it tells you (in a thin-file underwriting context, distinct from fraud)

  • Phone-number vintage — long-held number suggests stability.
  • Phone-PAN match — basic linkage.
  • Location history pattern (with explicit consent + DPDP-compliant vendor) — confirms borrower spends time at the claimed business address.
  • App-usage pattern: suggests digital savviness if the borrower uses banking / merchant apps.
  • Mobile-intel vendors (TruValidate, Bureau, IDfy) for vintage and pattern.
  • Borrower-consented mobile data collection — handle DPDP carefully.
  • Stability signals.
  • Fraud avoidance.

Strong consent and purpose-limitation requirements apply. Don’t collect location data for thin-file underwriting unless the borrower explicitly understands and consents.

R. Cooperative society / association membership

For borrowers who are members of local trade associations, cooperative societies, market-area associations, etc.:

  • Membership vintage.
  • Standing within the association.
  • Other members’ references.
  • Borrower-provided membership documents.
  • Association direct confirmation.
  • Embeddedness in local business community.
  • Often a strong reference network access point.

A thin-file underwriting decision typically combines 4 – 6 of the signals above with whatever standard signals are available. Example composition for a roadside chemist applying for a ₹3 lakh line:

  1. Bank statement via AA (6 months available; ABB ₹40k).
  2. UPI inflow pattern via AA (200 unique payers / month).
  3. Drug Licence (active, 7 years vintage).
  4. Anchor manufacturer confirmation (8 years distributorship with ₹2 Cr annual purchases).
  5. Field FI (chemist visible, stock substantial, owner present, signage 8 years old).
  6. Reference calls (3 customers + 3 suppliers confirmed).

This composition is richer than the standard GST + bank + bureau for many SMEs and unlocks credit that would otherwise be declined.
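A composition like the chemist example can be held as a small structured record so that the count of non-standard sources is explicit in the credit file. This is a sketch: the `Signal` class, the `standard` flag and the minimum of 4 are illustrative, and treating the drug licence as a non-standard source is an assumption of the example.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    source: str
    standard: bool  # True for GST / bank / bureau / Tally data
    finding: str


def composition_summary(signals, min_unconventional=4):
    """Split signals into standard and unconventional and check the count."""
    standard = [s.source for s in signals if s.standard]
    extra = [s.source for s in signals if not s.standard]
    return {
        "standard": standard,
        "unconventional": extra,
        "meets_minimum": len(extra) >= min_unconventional,
    }


chemist = [
    Signal("bank_statement_aa", True, "6 months available; ABB ₹40k"),
    Signal("upi_inflow_aa", False, "200 unique payers / month"),
    Signal("drug_licence", False, "active, 7 years vintage"),
    Signal("anchor_confirmation", False, "8 years distributorship"),
    Signal("field_fi", False, "stock substantial, owner present"),
    Signal("reference_calls", False, "3 customers + 3 suppliers confirmed"),
]
summary = composition_summary(chemist)
```

Recording the finding next to each source keeps the underwriting note auditable without implying any scoring or weighting between sources.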

  • DPDP — every data source has consent + purpose + retention rules.
  • Digital Lending Guidelines — data minimisation; only collect what’s needed for the specific underwriting purpose.
  • Outsourcing MD — every vendor governed.
  • Fair Practices Code — borrower communication of which data is being used.