posted on October 20, 2014 14:11
Over the past few months, I have co-hosted Conversations on Big Data, a series of discussions about using analytics in creative and interesting ways. Today’s Conversation is with Lori Walsh, the Chief of the Center for Risk and Quantitative Analytics (the Center) for the Securities and Exchange Commission’s (SEC) Division of Enforcement. The SEC has several analytics programs that are structured in a “hub and spoke system.” Lori’s Center sits at the hub to “centralize the information and determine how to share those techniques and tools generally”. Lori says that the “main part of my job entails proactive identification of … violations of the securities laws. And so I focus on data, analytical tools, and techniques to help identify violations … more quickly”.
Previously, Lori ran the Office of Market Intelligence for the Division of Enforcement, and this background prepared her well for her role with the Center. She cited the management of the tips and complaints process as an example of this preparation. The SEC receives approximately 20,000 tips annually that are documented, profiled, and then evaluated against several criteria such as credibility, significance, and risk. Lori told me “Seeing all of these tips come in day after day … made me see a pattern.” From these patterns, Lori says that she now tries to “go into the data and identify things before we get a tip.” “That’s been really critical in how we’ve structured the Center”. She says her method is to use data mining techniques to identify patterns in the data and correlate them to “violative” activity.
Lori shared her three essential elements for a good analytics program.
· Data
· Infrastructure
· Subject Matter Expertise
Data is not an issue for the SEC, which consumes a huge volume of data that is further processed, analyzed, and enhanced by regional offices sitting at the end of the various analytic “spokes” referred to above. However, Lori is very excited about technical advances in data integration. All that data, raw and derived, that the SEC consumes is collected at the Center. She does not worry about data quality because she actively manages it. “A lot of people think we don’t have enough data available.” “I say we’ve got way too much data available to us.” “We’ve got to figure out what data is needed to answer a question.” She goes on to say that “being tripped up by poor-quality data is a slightly different issue.” “You want to get the data as clean as possible.” But then you have to “caveat the output … based on the limitations of the data”.
As for Infrastructure, Lori describes her current integration process as “somewhat laborious and cumbersome”. She says “it’s a way of pulling together pieces of the puzzle … and you are able to see connections among the data putting pieces of the puzzle together.” She is starting to use tools that will not only automate and facilitate the actual integration process, but also “map it for you … using icons or histograms or a timeline so that you can see the data in lots of different ways.” She says that “data visualization is fairly new for us [and] is exciting.”
Lori says that she relies heavily on Subject Matter Experts to supply her with questions to apply to her large repository of data. The Division of Enforcement staff, including attorneys, accountants, and investigators, is well trained and experienced. Lori says “they know what fraud looks like, but they don’t necessarily know how to take that information out of their head and put it into an algorithm or data or analytics.” “We try to get the information out of the experts’ heads, identify patterns, identify data that we can use to apply the patterns to, and then filter the universe of potential behavior to the ones that are most likely to be high-risk.” Lori says she was taught, as an empiricist, that the first thing you need is a theory to test.
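To make the idea of turning expert knowledge into a filter a little more concrete, here is a minimal sketch in Python. It is not the Center's actual method. The Offering fields, the three rules, their weights, and the threshold are all hypothetical, invented purely for illustration. The sketch only shows the general shape of the approach Lori describes: write down the patterns an expert would look for, score each record against them, and keep the records that score highest.

```python
from dataclasses import dataclass


@dataclass
class Offering:
    # Hypothetical fields, chosen only for illustration.
    name: str
    promised_annual_return: float  # 0.40 means 40% per year
    issuer_registered: bool
    prior_complaints: int


# Each rule is (label, weight, test). The rules and weights are assumptions
# standing in for patterns that subject matter experts might supply.
RULES = [
    ("unusually high promised return", 2.0, lambda o: o.promised_annual_return > 0.25),
    ("issuer not registered", 1.5, lambda o: not o.issuer_registered),
    ("prior complaints on file", 1.0, lambda o: o.prior_complaints > 0),
]


def risk_score(offering):
    """Add up the weights of every rule the offering matches."""
    return sum(weight for _, weight, test in RULES if test(offering))


def high_risk(offerings, threshold=2.5):
    """Return (score, offering) pairs at or above the threshold, highest first."""
    scored = [(risk_score(o), o) for o in offerings]
    flagged = [pair for pair in scored if pair[0] >= threshold]
    return sorted(flagged, key=lambda pair: pair[0], reverse=True)
```

Calling high_risk on a list of offerings returns only the ones worth a closer look, ranked by score. In practice the rules themselves would have to come from the experts and be tested against real data, which is the theory-first point Lori makes.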
I asked Lori to tell us her definition of success. She says the “ultimate goal is for the Center to be more efficient, faster at identifying ‘violative’ activity. And if we identify something before we get a tip in, that’s a success.” She cites an example that occurred recently where the Center used a risk-based analytics process to identify a potentially fraudulent offering. It was referred to an investigative group within Enforcement. Two days later a tip came in on the very same offering. “That’s a success for us.”