FINRA - Implementing Self-Service Cloud Scalability Across the Organization

geetharam · August 2023

Team members:

Geetha Ramachandran, Senior Director, Technology
Paul Schiavone, Senior Director, Market Regulation
Alexey Egorov, Senior Data Scientist, Market Regulation
Wangfa Zhao, Senior Data Scientist, Market Regulation
Tate Welty, Data Scientist, Market Regulation
David Ptashny, Director, Member Supervision
Julian Asano, Senior Principal Specialist, Member Supervision
Alejandro Colocho, Developer, Member Supervision
Zihan Shao, Developer, Member Supervision
Otto Scheel, Lead Developer, Technology
Huriel Hernandez, Senior Developer, Technology
Maksim Abadjev, Staff Developer, Technology
Graham Adachi-Kriege, Developer, Technology
Peter Hopper, Application Engineer, Technology
Robin Ma, Data Engineer, Technology
Fumin Yang, Director, Product Management, Technology
Helen Benton, Product Manager, Technology

Country: United States

Organization: FINRA

FINRA is a not-for-profit organization dedicated to investor protection and market integrity. It regulates one critical part of the securities industry—brokerage firms doing business with the public in the United States. FINRA, overseen by the SEC, writes rules, examines for and enforces compliance with FINRA rules and federal securities laws, registers broker-dealer personnel and offers them education and training, and informs the investing public. In addition, FINRA provides surveillance and other regulatory services for equities and options markets, as well as trade reporting and other industry utilities. FINRA also administers a dispute resolution forum for investors and brokerage firms and their registered employees.

We use innovative AI and machine learning technologies to keep a close eye on the market and provide essential support to investors, regulators, policymakers and other stakeholders.

Awards Categories:

Best Positive Impact Use Case
Best Data Democratization Program
Best Approach for Building Trust in AI

Business Challenge:

Organizational background:

The Financial Industry Regulatory Authority (FINRA) is a not-for-profit organization authorized by the U.S. Congress to protect investors and ensure market integrity through effective and efficient regulation of broker-dealers. It writes and enforces rules governing the activities of more than 3,400 broker-dealers representing more than 630,000 brokers, examines firms for compliance, fosters market transparency, and educates investors.

Scale of Operations:

Each day, FINRA oversees nearly 600 billion market events across equities, options, and fixed income products within the U.S. This results in petabytes of historical data. The primary challenge is deciphering this vast data pool to identify malicious activities, such as insider trading, that can unfairly tilt the market.

Given the unpredictable nature of today’s markets, swift response mechanisms are crucial. At the heart of FINRA’s operations is the objective to ensure investor safety and uphold market integrity. This necessitates rapid data analysis to answer pivotal questions, including the identification of market threats and required regulatory interventions. The ability to analyze vast amounts of data swiftly from diverse sources is fundamental to our mission.

During the pilot period in Dataiku, we used the predefined compute cluster approach, one of the recommended ways to manage compute for analytics workloads. This addressed some of our scalability needs. However, given FINRA’s significant data volume, we anticipated capacity bottlenecks with this approach: jobs can become backlogged, causing considerable analysis delays, compromising swift action, and hampering the user experience. In addition, users would depend on the platform administration team for tailored cluster configurations.

To navigate these challenges, FINRA adopted a more self-sufficient, automated, and team-managed strategy that lets users adapt computing power to the task at hand, without being limited by the computational power of their laptops or the predefined clusters offered by administrators. In this evolved system, each team is responsible for launching, maintaining, and managing the cost of its own clusters. This shared responsibility ensures bespoke cluster configurations, fostering rapid and efficient data analysis.

Business Solution:

We developed a set of macros called Node Launcher in Dataiku based on Kubernetes. This capability allows users to define the bounds and limits of computational power for each project, thereby eliminating the constraints imposed by computing guidelines.
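
As a rough illustration of what "bounds and limits of computational power for each project" can mean in practice, the sketch below models a per-project node group with a floor and a ceiling on node count. The class name `NodeGroupSpec`, its fields, and the example project key and instance type are hypothetical; this is not the Node Launcher schema, only a minimal sketch of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroupSpec:
    """Hypothetical per-project compute bounds (not the Node Launcher schema)."""
    project_key: str
    instance_type: str
    min_nodes: int
    max_nodes: int

    def __post_init__(self):
        # Reject inconsistent bounds up front.
        if not 0 <= self.min_nodes <= self.max_nodes:
            raise ValueError("expected 0 <= min_nodes <= max_nodes")

    def clamp(self, requested_nodes: int) -> int:
        """Keep a scaling request inside the bounds set for the project."""
        return max(self.min_nodes, min(requested_nodes, self.max_nodes))


# Illustrative values only.
spec = NodeGroupSpec("RISK_MODELS", "m5.2xlarge", min_nodes=0, max_nodes=20)
print(spec.clamp(35))  # a request above the ceiling is capped at max_nodes
print(spec.clamp(5))   # a request inside the bounds is left unchanged
```

Keeping the bounds in one small, validated object means an autoscaler or a macro can only ever ask for a node count the project owners have already agreed to.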

However, with the increase in user adoption, the need for implementing guardrails became evident. These include:

Provisioning of clusters: To ensure proper usage, Node Launcher has an authorization system in place that allows teams to provision and use their designated clusters based on their project and group membership.
Self-service capability: Using Node Launcher, users can create node groups for their needs, launch a compute environment with the instance type they choose, and use multiple node groups for diverse workloads. All of these node groups scale up and down automatically, without manual intervention.
Performance tuning: With numerous clusters running on cloud providers, there was a need for extensive scaling and tuning, primarily focusing on optimizing startup time, networking improvements, and Spark configurations.
Security: Specific security improvements using S3 connections were implemented to ensure secure data access based on group membership.
Cost optimization measures:
- We use AWS Spot Instances for non-critical workloads. These can cost as little as roughly 10% of the on-demand EC2 price, but AWS can reclaim them at short notice.
- We developed a Cost dashboard and a Usage dashboard that let users see the cost breakdown across projects, Dataiku stacks (e.g., design, production), and users. This not only helps users manage their costs but also helps development teams analyze their activity.
- To ensure optimal usage, cost clinics are conducted each month, providing diagnostics of each cost and recommendations for optimization. A checklist of 5-10 recommendations is also provided to ensure that a project is optimized.
Resource Utilization metrics: We also built tools to analyze resource utilization and performance, which assist with overall cluster management. These are crucial in ensuring that the analysis process is efficient and effective, despite the large volumes of data involved.
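
Two of the guardrails above, group-based cluster authorization and the cost breakdown by project, stack, and user, can be sketched in a few lines. Everything here is an assumption made for illustration: the cluster and group names, the record fields, the rates, and the functions `can_use_cluster` and `cost_by` are hypothetical and do not reflect FINRA's actual implementation or data.

```python
from collections import defaultdict

# Hypothetical mapping of cluster name -> groups allowed to use it.
CLUSTER_GROUPS = {
    "market-reg-spark": {"market-regulation"},
    "member-sup-spark": {"member-supervision", "technology"},
}


def can_use_cluster(user_groups, cluster):
    """A user may use a cluster if they share at least one group with it."""
    return bool(set(user_groups) & CLUSTER_GROUPS.get(cluster, set()))


def cost_by(records, key):
    """Sum node cost per value of `key` ("project", "stack", or "user")."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += rec["node_hours"] * rec["hourly_rate"]
    return dict(totals)


# Made-up usage records; rates are illustrative, not real prices.
usage = [
    {"project": "RISK_MODELS", "stack": "design", "user": "analyst_a",
     "node_hours": 10, "hourly_rate": 0.5},
    {"project": "RISK_MODELS", "stack": "production", "user": "analyst_b",
     "node_hours": 4, "hourly_rate": 0.5},
    {"project": "SURVEILLANCE", "stack": "design", "user": "analyst_a",
     "node_hours": 8, "hourly_rate": 0.25},
]

print(can_use_cluster(["technology"], "member-sup-spark"))
print(cost_by(usage, "project"))
print(cost_by(usage, "stack"))
```

The same aggregation function serves all three dashboard views because the grouping dimension is passed in as a parameter rather than hard-coded.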

To fortify this capability, we hold Community of Practice sessions, which give teams a forum to discuss challenges and solutions and to learn from one another’s experiences. We also document user stories and catalogue them in a central repository, giving teams a resource to reference and learn from so they avoid reinventing the wheel. This approach not only resolves current challenges but also instills a culture of collaboration, continuous learning, and shared responsibility.

Business Area Enhanced: Analytics

Use Case Stage: In Production

Value Generated:

Scalable architecture: We are seeing outstanding scalability across the entire architecture, with:
- petabytes of source data
- more than 18 terabytes of analyzed data
- 8 million data objects created using Dataiku
- on any given day, an average of 500+ EC2 cluster nodes, 400+ jobs, and 200+ web applications running
DataOps driving efficiency and accuracy: We have published 12 prototype risk models in Dataiku as webapps, and 180 explorer users interact with them daily. The risk models have enabled these users to perform over 20,000 risk assessments in the past 10 months, with more efficient, standardized, and consistent risk analysis.
Self-service analytics: This architecture has provided a self-service capability in practice for our analysts, enabling them to leverage the elasticity of the cloud for any task. It has significantly broadened the scope of data analysis, making it more accessible and efficient.
Insights democratization: Explorer users interact with custom webapps in the Dataiku platform. These webapps let us build a front end tailored to each use case and present the results of complex analytics so that users can get the information they need quickly and easily and move on to the next task.

Value Brought by Dataiku:

Dataiku specifically enabled this positive impact by:

Incorporating Spark capabilities
Utilizing a Kubernetes backend to scale, in conjunction with the AWS integration
Providing an API SDK for efficient management
Featuring webapps that allow users to quickly develop and test risk models before putting them into production.

This has enabled the entire organization to conduct more efficient data analysis in reaction to market events and to make faster, better-informed decisions that may have nationwide impact.

Dataiku has provided a self-service analytics capability in which users are not limited by the computational power of their laptops or by the predefined clusters that administrators offer. Instead, they can leverage the elasticity of the cloud to analyze data based on their needs.

Value Type:

Improve customer/employee satisfaction
Reduce cost
Reduce risk
Save time

Value Range: Millions of $
