Fintech Insights

The Transatlantic Challenge: Getting a Handle on Trade Data

October 18, 2018

As has been well documented, the raft of post-crisis regulatory changes in the two largest securities markets in the world has had wide ramifications for all industry players, with arguably the most visible and transparent (pun intended!) being in the sphere of transaction reporting.

The SEC’s Rule 613, better known as the Consolidated Audit Trail (CAT), in the U.S. and the EC’s double whammy of EMIR and MiFIR/MiFID II in Europe have placed an onerous burden on firms to collect, aggregate and report an unprecedented amount of trade data.

While there are non-trivial differences between the scope of the SEC’s rule (listed equities and equity options only) and the wider net cast by ESMA (broader asset class coverage, and trading both on- and off-venue), the two regimes nevertheless present similar challenges.

For example, firms must collect data from multiple, often disparate systems, from their execution management and order management systems, through the middle layers, all the way to their back-office systems.

In addition, data types range from customer personal data to order and trade information to allocations and other data whose sources are harder to identify. Bigger broker-dealers can have several such platforms, both new and legacy, with complex links between them.

Under these regulations, firms will have to extract the data from these systems, then aggregate it, consolidate it, submit it to a processor or trade repository, and retain it. Retention requirements range from five to seven years, depending on where the trading takes place.
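As a rough illustration of the extract-and-consolidate step, here is a minimal Python sketch. Everything in it is an assumption made for the example: the two source systems, their field names and the sample rows are hypothetical, and it does not reflect any regulator's submission format or any vendor's implementation. It simply maps rows from two differently shaped exports onto common field names and merges them by order ID.

```python
def normalize_oms(row):
    """Map a hypothetical order-management export row to common field names."""
    return {
        "order_id": row["OrderID"],
        "symbol": row["Ticker"].upper(),
        "quantity": int(row["Qty"]),
    }


def normalize_back_office(row):
    """Map a hypothetical back-office export row to the same field names."""
    return {
        "order_id": row["ref"],
        "symbol": row["sym"].upper(),
        "account": row["acct"],
    }


def consolidate(*batches):
    """Merge normalized records from several systems, keyed by order ID."""
    merged = {}
    for batch in batches:
        for record in batch:
            # Fields from later batches are added to (or overwrite) earlier ones.
            merged.setdefault(record["order_id"], {}).update(record)
    return merged


# Made-up sample rows standing in for two system extracts.
oms_rows = [{"OrderID": "A1", "Ticker": "abc", "Qty": "500"}]
back_office_rows = [{"ref": "A1", "sym": "ABC", "acct": "ACC-42"}]

consolidated = consolidate(
    [normalize_oms(r) for r in oms_rows],
    [normalize_back_office(r) for r in back_office_rows],
)
print(consolidated)
```

In a real firm the hard part is agreeing on the common field names and on a reliable key across new and legacy platforms; the merge itself is the easy bit.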

And just when all of that work seems to be done, the processor/repository may find issues with the data and come back with requests for corrections.

This whole process is effectively producing an aggregated metadata set of a kind that has never existed in the history of financial services. Some of the biggest broker-dealers have tried to build this themselves, with varying degrees of success. Now it’s becoming reality, and it’s all due to regulation.

But this isn’t just “big” data; it’s enormous data. So while there’s clearly value in this data, how can firms capitalize on that value, and where do they start?

The first piece of the puzzle is business intelligence (BI) tools. Users can ask questions of the data. Some could be as simple as, “How many shares of ABC stock are we trading across the firm?”

Yet even a straightforward question like that might be hard to answer. What of more complicated questions, such as those about correlated exposures, or historical or emerging patterns among traders? Intuitive BI tools can help firms discover meaningful insights quickly and effectively.
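To make the simple share-count question concrete, here is a minimal Python sketch. It assumes the records have already been pulled from every system into one list with common field names; the system names, fields and quantities are invented for illustration. The arithmetic is trivial once that is true, which is the point: the difficulty lies in getting the data into one place and one shape.

```python
from collections import defaultdict

# Hypothetical, already-consolidated trade records (all values are made up).
trades = [
    {"source": "oms", "symbol": "ABC", "quantity": 1_000},
    {"source": "ems", "symbol": "ABC", "quantity": 250},
    {"source": "back_office", "symbol": "XYZ", "quantity": 400},
    {"source": "oms", "symbol": "ABC", "quantity": 750},
]


def shares_traded(records, symbol):
    """Return the firm-wide share count for a symbol and a per-system breakdown."""
    total = 0
    by_source = defaultdict(int)
    for record in records:
        if record["symbol"] == symbol:
            total += record["quantity"]
            by_source[record["source"]] += record["quantity"]
    return total, dict(by_source)


total, breakdown = shares_traded(trades, "ABC")
print(f"ABC shares traded across the firm: {total}")
print(f"By source system: {breakdown}")
```

A BI tool would typically run the equivalent of this grouping behind a point-and-click interface, over far larger volumes.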

The natural extension of that is to create a machine learning environment for metadata crunching that adds true value to the business. For example, deep learning algorithms may detect and identify trends that human operators just wouldn’t look for.

Such an approach is particularly appealing to firms that have allocated budget and staff to machine learning and artificial intelligence but aren’t sure where to start.

At FIS, we’ve built our Protegent CAT solution on the Apache Hadoop framework on the FIS cloud. As a result, it can stream-process billions of market events in a timely, flexible and cost-efficient way for any CAT reporter. Furthermore, Protegent CAT offers powerful analytics tools to discover patterns and correlations of order volumes, profitability, routing and other factors, leading to new business insights and opportunities. Outside of CAT reporting, we are also working with large institutions that are exploring ways in which they can leverage their data lakes using our Protegent Apache Hadoop framework and accompanying analytics.

Are you ready to start? Contact us to discover how our machine learning and AI experts are helping large institutions leverage their data lakes. Email us at