AI Lending Platform Transformation: A Fintech Success Story Meant to Scale

When Everything Was on the Line

In early 2024, a leading Artificial Intelligence (AI)–driven fintech platform transforming how financial services operate in India reached out for support. Their no-code platform was designed to simplify loan origination, KYC processing, and personalized offer generation — but just before launch, challenges emerged.

“We expect 3,000 to 4,000 users simultaneously during promotional spikes, and our AI agents will need to process thousands of credit applications in real time,” said the CTO.
“Our on-premises MySQL deployment can’t scale fast enough without risking downtime or compliance issues.”

The Harsh Reality We Unearthed

Our assessment found the infrastructure in disarray. During testing, legacy hardware struggled with even simple workloads, and query latencies exceeded 500 ms. For loan decisions, which must be made quickly, that delay is unacceptable: every millisecond counts, and slow responses cost opportunities. A single-instance database adds further risk, because one unexpected failure takes the whole service down. In addition, borrower data that is not properly encrypted or monitored is at risk of compromise and falls short of regulatory guidelines. This is not just a technical concern; outages or slow processing in lending can erode customer trust and attract regulatory scrutiny, turning potential growth into tangible risk.

Building Something That Might Actually Work

We built a robust foundation using Amazon RDS for PostgreSQL in a Multi-AZ deployment, allowing loan processing to continue uninterrupted with dynamic switching between availability zones. The database was designed for 200 tables, 140 constraints, 60 indexes, 25 sequences, and 10 triggers, ensuring smooth operation of modular ECS-based microservices. It started at 150 GB, with a planned 30% annual growth.
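The growth plan can be turned into a quick capacity estimate. The sketch below compounds the 150 GB starting size at 30% per year; the function name and the five-year horizon are illustrative assumptions, not figures from the project.

```python
def project_storage_gb(initial_gb: float, annual_growth: float, years: int) -> list[float]:
    """Return the projected size at the end of year 0..years, compounding annually."""
    return [round(initial_gb * (1 + annual_growth) ** year, 1) for year in range(years + 1)]


if __name__ == "__main__":
    # 150 GB and 30% growth come from the text; the 5-year horizon is assumed.
    for year, size_gb in enumerate(project_storage_gb(150, 0.30, 5)):
        print(f"year {year}: {size_gb} GB")
```

A projection like this helps decide how much allocated storage and headroom to plan for before the next sizing review.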

Read replicas across zones handled reporting and analytics, reducing read-heavy traffic on the main instance—like adding express lanes to a slow-moving motorway—keeping offer generation under 100 ms even during peak hours. For CI/CD, we implemented OIDC with GitHub Actions to publish container images to ECR and run ECS tasks for deployments. This setup allowed AI-driven processes to evolve securely without interrupting the credit flow.
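Offloading reads to replicas only works if the application sends each query to the right endpoint. The following is a minimal sketch of that routing idea, not the platform's actual code: the class name, the hostnames in the usage note, and the rule that only plain SELECT statements count as reads are all assumptions.

```python
import itertools


class EndpointRouter:
    """Send writes to the primary and spread reads across replicas (round-robin)."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Fall back to the primary if no replicas are configured.
        self._replica_cycle = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql: str) -> str:
        # Naive rule (assumption): only statements starting with SELECT are reads.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._replica_cycle) if is_read else self.primary
```

With hypothetical endpoints such as EndpointRouter("primary.example.internal", ["replica-a.example.internal", "replica-b.example.internal"]), reporting queries alternate between the two replicas while inserts and updates stay on the primary. Because replicas lag slightly behind the primary, reads that must see a just-committed write would still need to go to the primary.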

Making Sure Everything Can Be Watched and Is Secured

We integrated RDS Performance Insights with Amazon CloudWatch to ensure complete visibility. A real-time dashboard tracked CPU, IOPS, replication latency, and query times. Before issues—such as connection spikes from ELB-routed API calls—could affect customers, SNS notifications sent detailed alerts.
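Alerts of this kind are typically defined as CloudWatch metric alarms that notify an SNS topic. The sketch below only builds the alarm definitions as plain dictionaries; the thresholds, instance identifier, and topic ARN are placeholders we made up, and the source does not state the values actually used. The metric names (CPUUtilization, ReplicaLag, DatabaseConnections) are standard AWS/RDS metrics.

```python
def rds_alarm(name, metric, threshold, db_instance_id, sns_topic_arn):
    """Build keyword arguments for a CloudWatch metric alarm on one RDS instance."""
    return {
        "AlarmName": name,
        "Namespace": "AWS/RDS",
        "MetricName": metric,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 60,            # seconds per datapoint
        "EvaluationPeriods": 3,  # breach must persist for 3 datapoints
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


# Placeholder identifiers and thresholds (assumptions, not project values).
TOPIC = "arn:aws:sns:ap-south-1:123456789012:db-alerts"
ALARMS = [
    rds_alarm("db-cpu-high", "CPUUtilization", 80, "lending-db", TOPIC),
    rds_alarm("db-replica-lag", "ReplicaLag", 30, "lending-db-replica", TOPIC),
    rds_alarm("db-connections-high", "DatabaseConnections", 3500, "lending-db", TOPIC),
]
```

Each dictionary could be passed to boto3 as cloudwatch.put_metric_alarm(**alarm); keeping the definitions as data makes them easy to review and version alongside the rest of the infrastructure code.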

For security, AWS KMS encrypted data at rest and SSL/TLS protected it in transit, while Secrets Manager rotated service passwords every 90 days, creating a highly secure environment for sensitive information. IAM roles enforced least-privilege access, limiting users to the specific reports they needed and restricting what ECS tasks could reach. VPC controls (security groups, subnets, and NAT gateways) isolated ingress and egress traffic, safeguarding KYC documents and financial data. Additionally, GuardDuty detected anomalies without adding latency, helping the platform stay compliant with RBI requirements.

Addressing Real-Time Processing Challenges

The AI-powered lending platform required near real-time data to generate consumer offers. To handle this, we used read replicas and auto-scaling to manage 3,000–4,000 concurrent connections during campaign spikes. Route 53 provided DNS failover, while CloudFront cached static compliance assets at the edge. This setup kept user interactions across India under a second.

The Move That Needed to Be Perfect

Migrating from on-premises MySQL to RDS for PostgreSQL posed significant challenges. We used the AWS Schema Conversion Tool (SCT) to assess compatibility and AWS Database Migration Service (DMS) to move data and schemas in the UAT environment. Terraform provisioned the environment as code, identically every time, and blue-green deployments on ECS kept downtime under 15 minutes. A fully tested contingency plan, combining an on-premises read-only mode with S3 snapshot restoration, ensured no loan records would be lost.
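One common way to gain confidence that no records were lost is to reconcile per-table row counts between source and target after the data load. The source does not say which validation the team used, so the sketch below is only an illustration; the function name and the table names in the usage note are hypothetical.

```python
def reconcile_row_counts(source: dict[str, int], target: dict[str, int]) -> dict[str, tuple]:
    """Return tables whose row counts differ, mapped to (source_count, target_count)."""
    mismatches = {}
    for table in sorted(set(source) | set(target)):
        # None means the table is missing on that side.
        src, tgt = source.get(table), target.get(table)
        if src != tgt:
            mismatches[table] = (src, tgt)
    return mismatches
```

For example, reconcile_row_counts({"loans": 10, "kyc": 5}, {"loans": 10, "kyc": 4}) reports only the kyc table. The counts themselves would come from a SELECT COUNT(*) per table on each side; matching counts are a necessary check, not proof that every value converted correctly.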

Launch day attracted tremendous interest, with thousands of applications submitted. The system scaled seamlessly: Multi-AZ failover handled simulated zone loss in seconds, read replicas absorbed the read demand, and auto-scaling expanded capacity automatically. At peak load, warm queries at p95 remained under 20 ms, connections peaked at 4,000 without affecting responsiveness, and the platform maintained 99.99% uptime—delivering thousands of AI-driven outcomes flawlessly.
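A p95 figure means that 95% of sampled queries finished at or below that latency. The source does not say how the percentile was computed; the sketch below uses the nearest-rank method as one reasonable assumption.

```python
import math


def percentile_nearest_rank(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]
```

Applied to a window of query latencies in milliseconds, percentile_nearest_rank(latencies_ms, 95) gives the value to compare against a target such as 20 ms.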

Not Just Operational, but Effective

There were unexpected wins as well. The buffer cache hit ratio stayed above 95%, and monthly CloudWatch reviews kept the database appropriately sized. Archiving old loan data with S3 lifecycle rules reduced costs. Even with over 30% annual growth, a combination of reserved instance pricing and on-demand scaling reduced total cost of ownership by more than 50%. Lambda automations also caught issues such as rising replication lag before they could affect the platform.
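In PostgreSQL, the buffer cache hit ratio is usually derived from the blks_hit and blks_read counters in the pg_stat_database view. The sketch below shows that calculation only; how the team collected the counters is not stated in the source, and the function name is ours.

```python
def buffer_cache_hit_ratio(blks_hit: int, blks_read: int) -> float:
    """Percentage of block requests served from shared buffers."""
    total = blks_hit + blks_read
    if total == 0:
        return 0.0  # no block activity yet, so there is nothing to measure
    return 100.0 * blks_hit / total


# The counters could come from a query such as:
#   SELECT blks_hit, blks_read FROM pg_stat_database
#   WHERE datname = current_database();
```

A ratio that drifts downward over time is one signal that the working set has outgrown memory and the instance size deserves another look.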

What We Learned Along the Way

For read-heavy finance workloads, read replicas prevent bottlenecks by increasing the number of reads the system can serve in a given time period. Performance Insights helps identify slow queries early so that indexes can be adjusted for AI routing. OIDC-driven CI/CD lets us deploy new versions with confidence, while ECS allows upgrades without downtime. VPC isolation and KMS key rotation strengthened security significantly without extra manual effort to meet RBI requirements.

Keeping Security Robust

During setup, we required hardware MFA for root access, logged all activity in CloudTrail with S3 backups, and conducted IAM reviews quarterly. Our data sanitization scripts replaced PII with fake data for testing in non-production environments. This kept private loan information isolated and its handling legally compliant.
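Masking PII for non-production use can be sketched as replacing sensitive fields with deterministic tokens. This is an illustration under our own assumptions, not the team's script: the field names in the usage note, the salted SHA-256 approach, and the token format are all hypothetical.

```python
import hashlib


def mask_pii(record: dict, pii_fields: set[str], salt: str) -> dict:
    """Return a copy of record with the listed PII fields replaced by salted hash tokens."""
    masked = dict(record)  # leave the original record untouched
    for field in pii_fields:
        if field in record:
            digest = hashlib.sha256((salt + str(record[field])).encode("utf-8")).hexdigest()
            masked[field] = f"masked_{digest[:12]}"
    return masked
```

Calling mask_pii(row, {"pan", "phone"}, salt) leaves non-sensitive columns such as a loan ID intact. Because the same input always yields the same token, joins across tables keep working in test data; the salt must stay secret, since low-entropy values such as phone numbers could otherwise be guessed by brute force.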

Life in the Large

RDS enables fintech organizations to deliver new ideas at scale by supporting real-time lending to thousands of people. Crego is a great example. Their AI keeps customers engaged from KYC to offers, building trust in digital finance. This change was not just about technology; it streamlined loan access when borrowers needed it most, backed by stable infrastructure in the background.

Architecture Diagram

The metrics bear this out: 99.99% uptime during peak traffic, API responses under 100 ms, RTO under 15 minutes, and RPO under 5 minutes through cross-region replication. The platform absorbed every surge because auto-scaling did the work. Crego processed millions of loans without any issue while still providing a simple and secure way for individuals to obtain credit. Scalable infrastructure, in short, underpins the essential service of giving people access to money.
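An uptime percentage is easier to reason about as a downtime budget. The sketch below converts an availability target into allowed minutes of downtime; the 30-day period in the usage note is an assumption, since the source does not state the measurement window.

```python
def downtime_budget_minutes(availability_pct: float, period_days: int) -> float:
    """Maximum downtime, in minutes, that still meets the availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_pct) / 100
```

For a 99.99% target, downtime_budget_minutes(99.99, 30) works out to roughly 4.3 minutes over a 30-day window.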
