6.15 Unconventional data sources for thin-file underwriting
This page is a catalogue with operational detail. Each source has: what it tells you, how to obtain it (vendor / consent), what signal it generates, what fraud patterns to watch.
The goal is to compose 4 – 6 of these sources into a triangulated picture when standard data (GST + bank + bureau + Tally) is incomplete.
A. UPI transaction data via AA
What it tells you
For an owner-operated SME running customer payments through UPI on a current or savings account, AA-fetched UPI history reveals:
- Daily / weekly UPI inflow pattern — proxy for retail / counter sales.
- Number of unique payer VPAs / accounts — customer base size.
- Concentration — top payer share.
- Refund / reversal pattern — return rate proxy.
- Time-of-day / day-of-week pattern — operational rhythm.
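As a rough illustration, the metrics listed above could be derived from AA-fetched UPI transactions along the following lines. This is a minimal sketch rather than a production BSA routine: the `UpiTxn` record, its field names, and the convention that refunds appear as negative amounts are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class UpiTxn:
    # Hypothetical record shape; real AA payloads need to be mapped into it.
    txn_date: date
    payer_vpa: str
    amount: float  # assumption: credits are positive, refunds / reversals negative


def upi_inflow_metrics(txns):
    """Summarise UPI credits: payer count, concentration and refund rate."""
    credits = [t for t in txns if t.amount > 0]
    refunds = [t for t in txns if t.amount < 0]
    total_inflow = sum(t.amount for t in credits)

    # Inflow per payer VPA, used for customer-base size and concentration.
    by_payer = Counter()
    for t in credits:
        by_payer[t.payer_vpa] += t.amount

    top_share = max(by_payer.values()) / total_inflow if total_inflow else 0.0
    refund_rate = abs(sum(t.amount for t in refunds)) / total_inflow if total_inflow else 0.0
    return {
        "total_inflow": total_inflow,
        "unique_payers": len(by_payer),
        "top_payer_share": top_share,
        "refund_rate": refund_rate,
        "active_days": len({t.txn_date for t in credits}),
    }
```

Time-of-day and day-of-week patterns would need a full timestamp rather than a date, so they are left out of this sketch.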
How to obtain
- Account Aggregator pull (see 4.3). The borrower’s bank account’s UPI-credit transactions are part of the standard DEPOSIT FI type data.
- PSP-side data (Razorpay, Cashfree merchant dashboards) — borrower may export their own settlement reports.
Signal
- Net business activity for owner-operated retail / services.
- Concentration risk.
- Activity continuity (gaps in UPI flow signal trouble).
Fraud patterns
- Cycling UPI between own accounts (round-tripping to inflate inflow) — detect via counter-party analysis; if many transactions are between accounts owned by the same person, net out.
- Pre-loan UPI inflation — borrower instructs friends / family to send UPI just before underwriting; pattern shows sudden spike 2 – 4 weeks before application.
- VPA spoofing — uncommon but check for repeated identical VPAs at high frequency.
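The first two patterns lend themselves to a simple check: drop credits from the borrower's own VPAs, then compare average daily inflow in the weeks before application with the earlier baseline. The sketch below is illustrative only. The tuple layout `(date, payer_vpa, amount)`, the 28-day window, and the 2x ratio threshold are assumptions chosen for the example, not calibrated values.

```python
from datetime import date, timedelta


def net_out_own_accounts(txns, own_vpas):
    """Drop credits whose payer VPA belongs to the borrower (round-tripping)."""
    return [(d, vpa, amt) for d, vpa, amt in txns if vpa not in own_vpas]


def pre_application_spike(txns, application_date, window_days=28, ratio_threshold=2.0):
    """Compare daily inflow in the pre-application window with the earlier baseline.

    Returns (ratio, flagged), or None when there is no earlier history.
    """
    window_start = application_date - timedelta(days=window_days)
    recent = [amt for d, _, amt in txns if window_start <= d < application_date]
    earlier = [(d, amt) for d, _, amt in txns if d < window_start]
    if not earlier:
        return None  # nothing to compare against

    # Baseline runs from the first observed transaction up to the window start.
    baseline_days = (window_start - min(d for d, _ in earlier)).days
    baseline_daily = sum(amt for _, amt in earlier) / baseline_days
    recent_daily = sum(recent) / window_days
    ratio = recent_daily / baseline_daily if baseline_daily else float("inf")
    return ratio, ratio >= ratio_threshold
```

Running the spike check both before and after netting out own-account credits shows how much of an apparent surge is self-funded.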
Integration
- AA TSP returns UPI transactions tagged as UPI in the transaction narration. BSA layer categorises.
- For PSP-side: borrower exports + uploads (less authenticated than AA).
B. POS settlement data
What it tells you
For borrowers using physical POS terminals (Razorpay POS, mPOS, Pine Labs, Innoviti, Mswipe), settlement reports show:
- Daily card / UPI swipe revenue.
- Number of transactions.
- Average ticket size per swipe.
- Refunds / disputes.
- Per-terminal breakup if multi-terminal.
How to obtain
- PSP merchant portal export — borrower downloads settlement reports.
- API integration with PSPs that expose seller APIs.
- Bank account analysis — POS settlements come into the borrower’s bank account as identifiable daily / weekly batches; BSA categorisation surfaces them.
Signal
- Retail / hospitality activity baseline.
- Vintage if reports go back 12+ months.
- Customer demand stability.
- Seasonality patterns.
Fraud patterns
- Round-tripping via own cards — borrower swipes own card to inflate POS revenue; detect via repeated identical card numbers / pattern of suspiciously round amounts.
- Settlement-account mismatch — borrower claims POS revenue but settlement goes to a different bank account; verify settlement account.
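The own-card round-tripping check can be approximated by measuring how much of the swipe count comes from a single card and how many amounts are suspiciously round. The following is a sketch under stated assumptions: swipes are `(masked_card, amount)` pairs, "round" means a multiple of ₹500, and the two share thresholds are placeholders rather than tuned policy values.

```python
from collections import Counter


def pos_round_trip_flags(swipes, max_card_share=0.2, max_round_share=0.5, round_unit=500):
    """Flag repeated-card and round-amount patterns in POS swipes.

    swipes: list of (masked_card, amount) tuples (hypothetical layout).
    """
    n = len(swipes)
    if n == 0:
        return {}

    # Share of all swipes coming from the single most frequent card.
    card_counts = Counter(card for card, _ in swipes)
    top_card, top_count = card_counts.most_common(1)[0]

    # Share of swipes whose amount is an exact multiple of round_unit.
    round_count = sum(1 for _, amount in swipes if amount % round_unit == 0)

    return {
        "top_card": top_card,
        "top_card_share": top_count / n,
        "round_amount_share": round_count / n,
        "flag_repeated_card": top_count / n > max_card_share,
        "flag_round_amounts": round_count / n > max_round_share,
    }
```

Some businesses legitimately price in round amounts, so the round-amount flag is better treated as a prompt for review than as evidence on its own.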
C. Marketplace settlement data
What it tells you
For sellers on Amazon, Flipkart, Meesho, Myntra, JioMart, etc., settlement reports show:
- Gross orders + returns + net settlements.
- Per-marketplace breakup.
- Category mix.
- Refund / return rates.
- Ad-spend (if borrower shares).
- Cycle history (typically weekly / fortnightly).
How to obtain
- Marketplace API where available (Amazon SP-API is the most mature; Flipkart Seller API; Meesho Supplier Panel — variable).
- Borrower upload of CSV / Excel settlement reports.
- Authenticated screen capture via vendor tools (less ideal).
Signal
- True net revenue (after returns / fees).
- Demand stability.
- Operational consistency.
- Concentration vs marketplace diversity.
Fraud patterns
- Fabricated reports — borrower edits CSV to inflate. Verify via marketplace API where possible; reconcile against bank settlement credits.
- Account multi-pretence — borrower claims multiple marketplace accounts that share underlying ownership; check account-holder names.
- Cancelled-account hiding — borrower’s primary account was suspended; they share an active secondary account that’s small. Verify account standing.
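Reconciling an uploaded settlement report against bank settlement credits can be sketched as a greedy match on amount and date. The record layout `(date, amount)`, the ₹1 amount tolerance, and the 3-day settlement lag are assumptions for illustration; real marketplace cycles and fee adjustments would need their own handling.

```python
from datetime import date


def reconcile_settlements(reported, bank_credits, amount_tolerance=1.0, max_lag_days=3):
    """Match each reported settlement to an unused bank credit.

    reported / bank_credits: lists of (date, amount) tuples (hypothetical layout).
    A bank credit matches when the amount agrees within amount_tolerance and it
    lands on the reported date or up to max_lag_days later.
    """
    unused = list(bank_credits)
    unmatched = []
    for rep_date, rep_amount in reported:
        match = next(
            (
                credit
                for credit in unused
                if abs(credit[1] - rep_amount) <= amount_tolerance
                and 0 <= (credit[0] - rep_date).days <= max_lag_days
            ),
            None,
        )
        if match is None:
            unmatched.append((rep_date, rep_amount))
        else:
            unused.remove(match)  # each bank credit can back only one report row

    total = sum(amount for _, amount in reported)
    matched_value = total - sum(amount for _, amount in unmatched)
    return {
        "unmatched": unmatched,
        "matched_share": matched_value / total if total else 0.0,
    }
```

A low `matched_share` does not prove fabrication (settlements may go to another account), but it identifies the rows that need an explanation.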
D. GST e-invoice IRN verification
What it tells you
For invoices above the e-invoice threshold (currently ₹5 Cr aggregate turnover for issuers; threshold reduces periodically), the IRN (Invoice Reference Number) on the GST IRP confirms the invoice’s authenticity in real time. Even thin-file borrowers can produce e-invoices if their buyer issued them — the buyer’s IRN is queryable.
How to obtain
- GST IRP query API via a GSP.
- Borrower provides IRN; system verifies.
Signal
- Invoice authenticity (vs forged invoice for invoice-financing fraud).
- Underlying buyer’s GST identity.
- Invoice timing (the IRP acknowledgement returned with the IRN carries the generation timestamp).
Use case
Particularly valuable for invoice-backed thin-file lending where the borrower’s overall financial picture is thin but a specific invoice is to be financed. Verifying that the buyer is a real entity issuing real invoices materially reduces fraud risk on that specific draw.
E. E-way bill data
What it tells you
E-way bills are mandatory for goods movement above value thresholds (₹50,000+ typically; state-specific). Pattern of e-way bill generation by a borrower indicates:
- Logistics activity.
- Number of consignments.
- Geographic spread of customers.
- Vintage of business activity.
How to obtain
- E-way bill portal via GSP API (limited; primarily for the consignor’s own data).
- Borrower’s downloaded reports.
Signal
- Goods-business activity for borrowers who issue e-way bills.
- Vintage if data goes back > 12 months.
- Customer geographic spread.
Caveats
E-way bill data is issuer-specific — you see the borrower’s own e-way bills, not their customers’. For verifying a customer’s claim about being a regular buyer, e-way bill data alone is insufficient.
F. Logistics provider seller dashboards
What it tells you
For e-commerce sellers and direct-to-consumer brands using Shiprocket, Delhivery, Ekart, etc., the seller dashboard shows:
- Daily shipment volume.
- Order origins / destinations.
- COD vs prepaid mix.
- Return / refund rates.
- Service-level metrics.
How to obtain
- API if the logistics provider exposes seller API (Shiprocket has one).
- Borrower download of dashboard exports.
Signal
- True D2C / e-commerce activity (logistics is harder to fake than marketplace claims).
- Customer geographic spread.
- COD risk (high COD share = higher return rate).
Caveats
Logistics data isn’t standardised across providers; integration is per-provider.
G. Utility bills
What it tells you
Electricity bills (and gas / water where relevant) are a direct proxy for plant / business activity:
- Monthly kWh consumption — operational intensity.
- Sudden drops — > 30% MoM signals trouble (plant slowdown, owner walked away).
- Sustained level — operational stability.
- Connection vintage — how long the business has been operating from this premises.
How to obtain
- Borrower-uploaded PDFs of the last 12 – 24 months of bills.
- Discom API in select states (limited; varies).
- Field-agent capture during visit.
Signal
- Operational activity (especially valuable for manufacturers where electricity is a major input).
- Vintage of operations.
- Premises tenure.
Fraud patterns
- Cherry-picked months — borrower shows only good months; require continuous 24-month history.
- Fabricated bills — verify with discom or check PDF metadata.
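Both the > 30% month-on-month drop rule and the continuous-history requirement are easy to check once the bills are keyed by month. A minimal sketch, assuming consumption has already been extracted into a dict keyed by `(year, month)`; the function and field names are illustrative.

```python
def month_index(year_month):
    """Map (year, month) to a running month number so gaps are easy to spot."""
    year, month = year_month
    return year * 12 + (month - 1)


def electricity_flags(monthly_kwh, drop_threshold=0.30, required_months=24):
    """Check bill continuity and flag month-on-month drops beyond the threshold.

    monthly_kwh: dict mapping (year, month) -> kWh billed (hypothetical layout).
    """
    if not monthly_kwh:
        return {"months_covered": 0, "missing_months": [], "continuous": False, "sharp_drops": []}

    months = sorted(monthly_kwh)
    present = {month_index(m) for m in months}
    first, last = month_index(months[0]), month_index(months[-1])

    # Months between the first and last bill that the borrower did not supply.
    missing = [(i // 12, i % 12 + 1) for i in range(first, last + 1) if i not in present]

    sharp_drops = []
    for prev, cur in zip(months, months[1:]):
        # Only compare genuinely consecutive months with non-zero prior usage.
        if month_index(cur) - month_index(prev) != 1 or monthly_kwh[prev] <= 0:
            continue
        change = (monthly_kwh[cur] - monthly_kwh[prev]) / monthly_kwh[prev]
        if change < -drop_threshold:
            sharp_drops.append((cur, round(change, 3)))

    return {
        "months_covered": len(months),
        "missing_months": missing,
        "continuous": not missing and len(months) >= required_months,
        "sharp_drops": sharp_drops,
    }
```

Seasonal businesses may show legitimate drops, so a flagged month is a question for the borrower rather than an automatic decline.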
H. Property tax records
What it tells you
Municipal property tax records confirm:
- The borrower (or related party) owns the property at the claimed business address.
- Property vintage (period for which tax has been paid).
- Property value as per municipal record.
How to obtain
- Borrower-uploaded tax receipts.
- Municipal portal for some urban municipalities (verification only).
- Field-agent capture.
Signal
- Asset confirmation (the business has a physical premises owned by borrower or family).
- Vintage (long-held property suggests stable family / business).
- Recovery anchor — if loan defaults, owned property is a legal recovery vector (subject to SARFAESI applicability for secured products).
I. Bank passbook OCR
What it tells you
For borrowers without netbanking but with a current / savings account that’s actively used, photographed pages of the passbook can be OCR’d to extract transaction history.
How to obtain
- Borrower-uploaded clear photos of passbook pages.
- OCR vendor (Karza, Perfios, custom) extracts.
Signal
- Basic transaction pattern.
- Vintage (passbook printed dates).
- Account activity continuity.
Caveats
- Quality varies wildly; clear photos required.
- OCR can mis-read handwritten amounts.
- Tampering is harder to detect than in a PDF.
- Use as supplementary signal only, not primary.
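One sanity check that can help with the mis-read and tampering caveats is running-balance arithmetic: each printed balance should equal the previous balance minus the debit plus the credit. This check is not described in the text above; it is offered as an assumption-laden sketch, with a hypothetical row layout of `(debit, credit, printed_balance)` in passbook order.

```python
def balance_mismatches(opening_balance, rows, tolerance=0.01):
    """Return rows whose printed balance disagrees with the running arithmetic.

    rows: list of (debit, credit, printed_balance) in passbook order.
    Each mismatch is reported as (row_index, expected_balance, printed_balance).
    """
    mismatches = []
    running = opening_balance
    for index, (debit, credit, printed) in enumerate(rows):
        expected = running - debit + credit
        if abs(expected - printed) > tolerance:
            mismatches.append((index, expected, printed))
        # Continue from the printed balance so a single misread amount is
        # reported on its own row instead of cascading through later rows.
        running = printed
    return mismatches
```

A mismatch cannot distinguish an OCR error from an edited page, so flagged rows are candidates for manual review against the original photo.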
J. Photographic and field-observational evidence
What it tells you
Photos captured by field agent during FI visit:
- Shop front + signage — confirms business presence, vintage hints.
- Inside view — stock visible, layout, fit-out condition.
- Equipment / machinery — for manufacturers.
- Stock register — handwritten in many SMEs.
- Owner photo at premises — confirms owner-operator status.
- Employee count visible.
How to obtain
- Field-agent app with geo-tagged photo capture mandatory.
Signal
- Confirms business is real and operating (most important for thin-file).
- Implies scale (inventory visible).
- Captures fit-out vintage.
Fraud patterns
- Staged premises — borrower borrows / rents someone else’s premises for the visit. Mitigate via surprise visit + check neighbour confirmation.
- Photo manipulation — geo-tag + EXIF + capture-time enforce honesty.
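The geo-tag and capture-time checks could look roughly like this: compute the great-circle distance between the photo's coordinates and the claimed premises, and confirm the capture time falls inside the scheduled visit. The sketch assumes the field-agent app has already extracted latitude, longitude, and capture time; the 150 m radius and all function names are illustrative choices.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def photo_checks(photo_lat, photo_lon, captured_at,
                 premises_lat, premises_lon,
                 visit_start, visit_end, max_distance_m=150.0):
    """Check a photo's geo-tag and capture time against the claimed visit."""
    distance = haversine_m(photo_lat, photo_lon, premises_lat, premises_lon)
    return {
        "distance_m": distance,
        "near_premises": distance <= max_distance_m,
        "within_visit_window": visit_start <= captured_at <= visit_end,
    }
```

Metadata can itself be spoofed, which is why capturing inside the app (rather than accepting gallery uploads) matters more than the check itself.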
K. Reference checks (customers + suppliers + neighbours)
What it tells you
Structured reference calls to 3 + 3 + 2 references (3 customers + 3 suppliers + 2 neighbours) confirm:
- Borrower’s business relationship existence and tenure.
- Volume / frequency of dealings.
- Payment behaviour (whether borrower pays suppliers on time, whether customers pay borrower on time).
- Reputation in the local market.
How to obtain
- Borrower provides reference list with consent for being contacted.
- Field agent visits / call agent calls with structured script; calls recorded for audit.
Signal
- High-quality reference checks are among the strongest thin-file signals.
- Detects fraud (fake business with no real customers / suppliers fails reference checks).
Detail in 6.16.
L. Anchor / manufacturer confirmation letters
What it tells you
For distributors / dealers / vendors of a larger manufacturer, an anchor confirmation letter (or anchor’s own portal data) shows:
- Distributorship vintage with the anchor.
- Monthly purchase volume.
- Payment behaviour with anchor.
- Distributorship status (active / under-review / on-watch).
How to obtain
- Borrower obtains and uploads the anchor’s letter.
- Direct verification with anchor’s commercial team.
- Anchor portal if anchor provides distributor-portal access to lenders (rare for casual referrals; standard for formal channel-finance programmes).
Signal
- Very high-confidence signal of business reality + scale.
- Replaces much of standard data when anchor is reliable.
Detail in 6.18.
M. Trade-credit history with named suppliers
What it tells you
Borrower-declared list of 5 – 10 regular suppliers with whom they have running credit (typical 15 – 45 day credit terms). Verification via:
- Supplier call confirming relationship + payment behaviour.
- Sample invoices from supplier showing credit terms.
Signal
- Implies the borrower has trade-credit standing in the local market.
- Suggests payment discipline (no supplier would extend credit otherwise).
N. Footfall data (mobile-derived) for retail / hospitality
What it tells you
For retail counters and hospitality (restaurants, salons, clinics), anonymised footfall data derived from mobile-location data (where vendors offer it within DPDP-compliant frameworks) shows:
- Customer footfall trend.
- Peak hours / days.
- Comparison with category benchmarks.
How to obtain
- Specialised vendors (rare in India today; growing).
- Borrower’s own POS / camera-based footfall counter (some new restaurants / clinics use these).
Signal
- Demand stability for footfall-dependent businesses.
Caveats
- Coverage is uneven.
- DPDP implications need careful handling.
O. Social-media presence and customer reviews
What it tells you
Public business pages and review aggregators (Google My Business, JustDial, Sulekha, IndiaMART, marketplace seller pages) show:
- Business presence and longevity.
- Customer review volume and sentiment.
- Photo content showing the business.
- Operating hours, contact details.
How to obtain
- Public search (via internal tool that aggregates).
- Borrower’s claimed handles verified.
Signal
- Validates business presence and customer-facing reality.
- Reviews signal customer satisfaction (proxy for business health).
Caveats
- Some businesses don’t have any online presence; absence isn’t a strong negative signal.
- Reviews can be fake (positive or negative).
P. SHG / JLG history (for very small entities)
What it tells you
For borrowers with microfinance / SHG (Self-Help Group) / JLG (Joint Liability Group) history:
- Vintage of group membership.
- Loan repayment record within group.
- Group’s overall performance.
How to obtain
- MFI bureau (CIBIL TransUnion MFI / CRIF Highmark MFI) — coverage of MFI / SHG records.
- MFI direct confirmation if borrower had specific loans.
Signal
- Demonstrated payment discipline at small scale; suggests willingness if not capacity to repay.
- Vintage of formal credit-system engagement.
Q. Mobile / device intelligence
Section titled “Q. Mobile / device intelligence”What it tells you (in a thin-file underwriting context, distinct from fraud)
- Phone-number vintage — long-held number suggests stability.
- Phone-PAN match — basic linkage.
- Location history pattern (with explicit consent + DPDP-compliant vendor) — confirms borrower spends time at the claimed business address.
- App-usage pattern — suggests digital-savvy if borrower uses banking / merchant apps.
How to obtain
- Mobile-intel vendors (TruValidate, Bureau, IDfy) for vintage and pattern.
- Borrower-consented mobile data collection — handle DPDP carefully.
Signal
- Stability signals.
- Fraud avoidance.
DPDP considerations
Strong consent + purpose limitation requirements. Don’t collect location data for thin-file underwriting unless the borrower explicitly understands and consents.
R. Cooperative society / association membership
What it tells you
For borrowers who are members of local trade associations, cooperative societies, market-area associations, etc.:
- Membership vintage.
- Standing within the association.
- Other members’ references.
How to obtain
- Borrower-provided membership documents.
- Association direct confirmation.
Signal
- Embeddedness in local business community.
- Often a strong reference network access point.
How to combine these signals
A thin-file underwriting decision typically combines 4 – 6 of the signals above with the standard signals available. Example composition for a roadside chemist applying for a ₹3 lakh line:
- Bank statement via AA (6 months available; ABB ₹40k).
- UPI inflow pattern via AA (200 unique payers / month).
- Drug Licence (active, 7 years vintage).
- Anchor manufacturer confirmation (8 years distributorship with ₹2 Cr annual purchases).
- Field FI (chemist visible, stock substantial, owner present, signage 8 years old).
- Reference calls (3 customers + 3 suppliers confirmed).
This composition is richer than the standard GST + bank + bureau for many SMEs and unlocks credit that would otherwise be declined.
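The composition for the roadside chemist can be recorded as structured data so that the "4 – 6 sources" expectation is checked explicitly. This is only a sketch: the `Signal` record, the `standard` flag separating conventional data from the unconventional sources on this page, and the minimum of four are assumptions drawn from the example, not a scoring model.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    # Hypothetical record for one evidence source in a thin-file case.
    source: str
    detail: str
    verified: bool
    standard: bool = False  # True for GST / bank / bureau / Tally style data


def composition_summary(signals, min_unconventional=4):
    """Count verified unconventional sources and list anything unverified."""
    verified_unconventional = [s for s in signals if s.verified and not s.standard]
    return {
        "verified_unconventional": len(verified_unconventional),
        "meets_minimum": len(verified_unconventional) >= min_unconventional,
        "unverified": [s.source for s in signals if not s.verified],
    }


# Values taken from the roadside chemist example above.
chemist = [
    Signal("Bank statement via AA", "6 months available; ABB Rs 40k", True, standard=True),
    Signal("UPI inflow via AA", "200 unique payers / month", True),
    Signal("Drug Licence", "active, 7 years vintage", True),
    Signal("Anchor confirmation", "8 years distributorship", True),
    Signal("Field FI", "stock substantial, owner present", True),
    Signal("Reference calls", "3 customers + 3 suppliers confirmed", True),
]
```

Calling `composition_summary(chemist)` reports five verified unconventional sources alongside the one standard source, which is within the 4 – 6 range described above.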
Compliance touchpoints
- DPDP — every data source has consent + purpose + retention rules.
- Digital Lending Guidelines — data minimisation; only collect what’s needed for the specific underwriting purpose.
- Outsourcing MD — every vendor governed.
- Fair Practices Code — borrower communication of which data is being used.