Earlier this month, Experian joined the nation’s largest community of online lenders at LendIt Fintech USA 2019 in San Francisco, CA, to show more than 5,000 attendees from 50 countries the ways consumer-permissioned data is changing the credit landscape.

Experian Consumer Information Services Group President Alex Lintner and FICO Chief Executive Officer Will Lansing delivered a joint keynote on innovation around financial inclusion and credit access. The keynote addressed the analytical developments behind consumer-permissioned data and how it can be leveraged to responsibly and securely extend credit to more consumers. The session was moderated by personal finance expert Lynnette Khalfani-Cox of The Money Coach.

“Consumer-permissioned data is not a new concept,” said Lintner. “All of us are on Facebook, Twitter, and LinkedIn. The information on these platforms is given by consumers. The way we are using consumer-permissioned data extends that concept to credit services.”

During the keynote, both speakers highlighted recent company credit innovations. Lansing talked about UltraFICO™, a score that adds bank transaction data with consumer consent to recalibrate an existing FICO® Score, and Lintner discussed the newly launched Experian Boost™, a free, groundbreaking online platform that allows consumers to instantly boost their credit scores by adding telecommunications and utility bill payments to their credit file.

“If a consumer feels that the information on their credit files is not complete and that they are not represented holistically as an applicant for a loan, then they can contribute their own data by giving access to tradelines, such as utility and cell phone payments,” explained Lintner.

There are approximately 100 million people in America who do not have access to fair credit because they are subprime, have thin credit files or have no lending history. Subprime consumers will spend an additional $200,000 over their lifetime on the average loan portfolio. Credit innovations such as Experian Boost and UltraFICO not only give consumers greater control and access to quality credit, but also expand the population that lenders can responsibly serve while providing a differentiated and competitive advantage.

“Every day, our data is used in one million credit decisions; 350 million per year,” said Lintner. “When our data is being used, it represents the consumers’ credit reputation. It needs to be accurate, it needs to be timely and it needs to be complete.”

Following the keynote, Experian, FICO, Finicity and Deserve joined forces in a breakout panel to dive deeper into the concept of consumer-permissioned data. Panel speakers included Greg Wright, Chief Product Officer at Experian’s Consumer Information Services; Dave Shellenberger, Vice President of Product Management at FICO; Nick Thomas, Co-Founder, President and Chief Technology Officer at Finicity; and Kalpesh Kapadia, Chief Executive Officer at Deserve.

“As Alex described in today’s keynote, consumer-permissioned data is not a new concept,” said Greg Wright. “The difference here is that Experian, FICO and Finicity are applying this concept to credit services, working together to bring consumer-permissioned data to mass scale, so that lenders can reach more people while taking on less risk.”

For an inside look at Experian and FICO’s joint keynote, watch the video below, or visit Experian.com and boost your own credit score.
At Experian, we know that fintechs don’t just need big data – they need the best data, and they need that data as quickly as possible. Successfully delivering on this need is one of the many reasons we’re proud to be selected as a Fintech Breakthrough Award winner for the second consecutive year.

The Fintech Breakthrough Awards is the premier awards program founded to recognize fintech innovators, leaders and visionaries from around the world. The 2019 Fintech Breakthrough Award program received more than 3,500 nominations from across the globe. Last year, Experian took home the Consumer Lending Innovation Award for our Text for Credit solution – a powerful tool that gives consumers the convenience to securely bypass the standard-length ‘pen & paper’ or keystroke-intensive credit application process while helping lenders make smart, fraud-protected lending decisions. This year, we are excited to announce that Experian’s Ascend Analytical Sandbox™ has been selected as the winner in the Best Overall Analytics Platform category.

“We are thrilled to be recognized by Fintech Breakthrough for the second year in a row and that our Ascend Analytical Sandbox has been recognized as the best overall analytics platform in 2019,” said Vijay Mehta, Experian’s Chief Innovation Officer. “We understand the challenges fintechs face – to stay ahead of constantly changing market conditions and customer demands,” said Mehta. “The Ascend Analytical Sandbox is the answer, giving financial institutions the fastest access to the freshest data so they can leverage the most out of their analytics and engage their customers with the best decisions.”

Debuting in 2018, Experian’s Ascend Analytical Sandbox is a first-to-market analytics environment that moved companies beyond just business intelligence and data visualization to data insights and answers they could actually use. In addition to thousands of scores and attributes, the Ascend Analytical Sandbox offers users industry-standard analytics and data visualization tools like SAS, RStudio, Python, Hue and Tableau, all backed by a network of industry and support experts to help users drive the most value out of their data and analytics. Less than a year post-launch, the groundbreaking solution is being used by 15 of the top financial institutions globally.

Early Access Program

Experian is committed to developing leading-edge solutions to power fintechs, knowing they are some of the best innovators in the marketplace. Fintechs are changing the industry, empowering consumers and driving customer engagement like never before. To connect fintechs with the competitive edge, Experian launched an Early Access Program, which fast-tracks onboarding to an exclusive market test of the Ascend Analytical Sandbox. In less than 10 days, our fintech partners can leverage the power, breadth and depth of Experian’s data, attributes and models. With endless use cases and easy delivery of portfolio monitoring, benchmarking, wallet share analysis, model development and market entry, the Ascend Analytical Sandbox gives fintechs the fastest access to the freshest data so they can leverage the most out of their analytics and engage their customers with the best decisions.

A Game Changer for the Industry

In a recent IDC customer spotlight, OneMain Financial reported the Ascend Analytical Sandbox had helped them reduce their archive process from a few months to 1–2 weeks, a nearly 75% time savings.
“Imagine having the ability to have access to every single tradeline for every single person in the United States for the past almost 20 years and have your own tradelines be identified among them. Imagine what that can do,” said OneMain Financial’s senior managing director and head of model development. For more information, download the Ascend Analytical Sandbox™ Early Access Program product sheet here, or visit Experian.com/Sandbox.
Alternative credit data and trended data each have advantages for lenders and financial institutions. Is there such a thing as the MVD (Most Valuable Data)?

When it comes to the big game, we can all agree the score is the last thing standing; however, how the two teams arrived at that score is arguably the more important part of the story. The same goes for consumers’ credit scores. The teams’ past records and highlight reels give insight into their actual past performance, while game-day factors beyond the stat sheets – think weather, injury rehab and personal lives – also play a part.

Similarly, consumers’ credit scores according to the traditional credit file may be the dependable source for determining creditworthiness. But while the traditional credit file is extensive, there is a playbook of other, additional information you can arm yourself with for easier, faster and better lending decisions. We’ve outlined what you need to create a win-win data strategy: alternative credit data and trended data each have unique advantages over traditional credit data for lenders and consumers alike.

How do you formulate a winning strategy? By making sure you have both powerhouses on your roster. The results? Better than that game-winning touchdown and hoisting the trophy above your head – universe expansion and the ability to lend deeper.

Get Started Today
Are You #TeamTrended or #TeamAlternative?

There’s no such thing as too much data, but when put head to head, differences between the data sets are apparent. Which team are you on? Here’s what we know:

With the entry and incorporation of alternative credit data into the data arena, traditional credit data is no longer the sole determinant of creditworthiness, granting more people credit access. Built for the factors influencing financial health today, alternative credit data essentially fills the gaps in the traditional credit file, including alternative financial services data, rental payments, asset ownership, utility payments, full-file public records and consumer-permissioned data – all FCRA-compliant data. Watch this video to see more.

Trended data, on the other hand, shows actual, historical credit data. It provides key balance and payment data for the previous 24 months, allowing lenders to leverage behavior trends to determine how individuals are utilizing their credit. Different slices of that information reveal particular behavior patterns, empowering lenders to then act on that behavior. Insights include a consumer’s spend on all general-purpose credit and charge cards, as well as predictive metrics that identify consumers who will be in the market for a specific type of credit product.

In the head-to-head between alternative credit data and trended data, both have clear advantages. You need both on your roster to supplement traditional credit data and elevate your game to the next level when it comes to your data universe.

Compared with the traditional credit file, alternative credit data can reveal information differentiating two consumers. In the examples below, both consumers have moderate limits and have been making timely credit card payments according to their traditional credit reports. However, alternative data gives insight into their alternative financial services information. In Example 1, Robert Smith is currently past due on his personal loan, whereas Michelle Lee in Example 2 is current on her personal loan, indicating she may be the consumer with stronger creditworthiness.

Similarly, trended data reveals that all credit scores are not created equal. Here is an example of how trended data can differentiate two consumers with the same score: different historical trends can show completely different trajectories between seemingly similar consumers (see the sketch after this post for a simple illustration). While the traditional credit score is a reliable indication of a consumer’s creditworthiness, it does not offer the full picture. What insights are you missing out on?

Go to Infographic

Get Started Today
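To make the trended-data point concrete, here is a minimal sketch in Python. The balance histories and consumer names are hypothetical, not Experian data; the idea is simply that a slope fit over 24 months of balances can separate two consumers whose current snapshot looks identical.

```python
# A minimal, hypothetical sketch: two consumers with the same current
# balance but opposite 24-month balance trajectories.
import numpy as np
import pandas as pd

months = np.arange(24)
trended = pd.DataFrame({
    "consumer_a": np.linspace(8000, 5000, 24),  # paying balances down
    "consumer_b": np.linspace(2000, 5000, 24),  # building balances up
})

# Fit a simple slope to each history: negative = deleveraging,
# positive = accumulating debt. Both end the window at the same $5,000.
slopes = trended.apply(lambda balances: np.polyfit(months, balances, 1)[0])
print(slopes.round(1))  # consumer_a ≈ -130.4/month, consumer_b ≈ +130.4/month
```

A point-in-time score built only on the final month would treat these two consumers identically; the slope, one of the simplest possible trended attributes, does not.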
From the time we wake up to the minute our head hits the pillow, we make about 35,000 conscious and unconscious decisions a day. That’s a lot of processing in a 24-hour period. As part of that process, some decisions are intuitive: we’ve been in a situation before and know what to expect. Our minds make shortcuts to save time for the tasks that take a lot more brainpower. As for new decisions, it might take some time to adjust, weigh all the information and decide on a course of action. But after the new situation presents itself over and over again, it becomes easier and easier to process.

Similarly, using traditional data is intuitive. Lenders have been using the same types of data in consumer creditworthiness decisions for decades. Throwing in a new data asset might take some getting used to. For those who are wondering whether to use alternative credit data, specifically alternative financial services (AFS) data, here are some facts to make that decision easier. In a recent webinar, Experian’s Vice President of Analytics, Michele Raneri, and Data Scientist, Clara Gharibian, shed some light on AFS data from the leading source in this data asset, Clarity Services. Here are some insights and takeaways from that event.

What is Alternative Financial Services?

A financial service provided outside traditional banking institutions, including online and storefront short-term unsecured loans, short-term installment loans, marketplace loans, car title loans and rent-to-own. As part of the digital age, many non-traditional loans are also moving online, where consumers can access credit with a few clicks on a website or in an app.

AFS data provides insight into consumers across the spectrum of thick- to thin-file credit histories. This data set, which holds information on more than 62 million consumers nationwide, is also meaningful and predictive – a direct answer for lenders who are looking for more information on the consumer. In fact, in a recent State of Alternative Credit Data whitepaper, Experian found that 60 percent of lenders report declining more than 5 percent of applications because they have insufficient information to make a loan decision. Having more information on that 5 percent would make a measurable impact for lender and consumer alike.

For example, inquiry data is useful in that it provides insight into activity across the alternative financial services industry. There are also stability indicators in this data, such as number of employers, unique home phones and ZIP codes. These interaction points indicate the stability or volatility of a consumer, which may be helpful in decision making during the underwriting stage (see the sketch following this post for a simple illustration).

AFS consumers tend to be younger and less likely to be married compared with the U.S. average and traditional credit data on File One℠. These consumers also tend to have lower VantageScores, lower debt, higher bad rates and much lower spend. These statistics point to the emerging consumer: millennials, immigrants with little to no credit history, and subprime or near-prime consumers who are demonstrating better credit management. There may also be older consumers who haven’t engaged with traditional credit in a while, or those who have hit a major life circumstance and had nowhere else to turn.
Still others who have turned to nontraditional lending may have preferred the experience of online lending and did not realize that many of these trades do not impact their traditional credit file. Regardless of their individual circumstances, consumers who leverage alternative financial services have historically had one thing in common: their performance on these products did nothing to further their access to traditional, and often lower-cost, sources of credit.

Through Experian’s acquisition and integration of Clarity Services, the nation’s largest alternative finance credit bureau, lenders can gain access to powerful and predictive supplemental credit data that better detects risk while benefiting consumers with a more complete credit history. Alternative finance data can be used across the lending cycle, from prospecting to decisioning and account review to collections. Alternative data gives lenders an expanded view of consumer behavior, which enables more complete and confident lending decisions. Find out more about Experian’s alternative credit data: www.experian.com/alternativedata.
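As a purely hypothetical illustration of the stability indicators mentioned in this post, here is a small Python sketch. The field names and thresholds are our own assumptions for illustration, not Clarity Services’ actual schema or scoring logic.

```python
# A hypothetical sketch: rolling employer, phone and ZIP code churn into a
# simple volatility flag. Fields and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class AfsRecord:
    employers_24mo: int    # distinct employers reported in 24 months
    home_phones_24mo: int  # distinct home phone numbers seen
    zip_codes_24mo: int    # distinct ZIP codes seen

def is_volatile(rec: AfsRecord) -> bool:
    # Frequent change across several identity anchors suggests instability.
    churn_signals = sum([
        rec.employers_24mo >= 3,
        rec.home_phones_24mo >= 3,
        rec.zip_codes_24mo >= 4,
    ])
    return churn_signals >= 2

print(is_volatile(AfsRecord(4, 1, 5)))  # True: two of three churn signals fire
```

In practice such signals would feed a model rather than a hard rule, but the sketch shows why interaction points like these can add decisioning value beyond the tradeline itself.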
With scarce resources and limited experience available in the data science field, a majority of organizations are partnering with outside firms to fill gaps within their teams. A report compiled by Hexa Research found that the data analytics outsourcing market is set to expand at a compound annual growth rate of 30 percent between 2016 and 2024, reaching annual revenues of more than $6 billion. With data science becoming a necessity for success, outsourcing these specific skills will be the way of the future. When working with outside firms, you may be given the option between offshore and onshore resources. But how do you decide? Let’s discuss a few things you can consider.

Offshore

A well-known benefit of using offshore resources is lower cost. Offshore resources provide a larger pool of talent, which includes those who have specific analytical skills that are becoming rare in North America. By partnering with outside firms, you also expose your organization to global best practices by learning from external resources who have worked in different industries and locations. If a partner is investing research and development dollars into specific data science technology or new analytics innovations, you can use this knowledge and apply it to your business.

With every benefit, however, there are challenges. Time zone differences and language barriers are things to consider if you’re working on a project that requires a large amount of collaboration with your existing team. Security issues need to be addressed differently when using offshore resources. Lastly, reputational risk also can be a concern for your organization. In certain cases, there may be a negative perception — both internally and externally — of moving jobs offshore, so it’s important to consider this before deciding.

Onshore

While offshore resources can save your organization money, there are many benefits to hiring onshore analytical resources. Many large projects require cross-functional collaboration. If collaboration is key to the projects you’re managing, onshore resources can more easily blend with your existing resources because of time zone similarities, reduced communication barriers and stronger cultural fit into your organization. In the financial services industry, there also are regulatory guidelines to consider. Offshore resources often may have the skills you’re looking for but don’t have a complete understanding of our regulatory landscape, which can lead to larger problems in the future. Hiring resources with this type of knowledge will help you conduct the analysis in a compliant manner and reduce your overall risk.

All of the above

Many of our clients — and we ourselves — find that an all-of-the-above approach is both effective and efficient. In certain situations, some timeline reductions can be made by having both onshore and offshore resources working on a project. Teams can include up to three different groups:

- Local resources who are closest to the client and the problem
- Resources in a nearby foreign country whose time zone overlaps with that of the local resources
- More analytical team members around the world whose tasks are accomplished somewhat more independently

Carefully focusing on how the partnership works and how the external resources are managed is even more important than where they are located. Read 5 Secrets to Outsourcing Data Science Successfully to help you manage your relationship with your external partner.
If your next project calls for experienced data scientists, Experian® can help. Our Analytics on Demand™ service provides senior-level analysts, either offshore or onshore, who can help with analytical data science and modeling work for your organization.
What if you had an opportunity to boost your credit score with a snap of your fingers? With the announcement of Experian Boost™, this will soon be the new reality. As part of an increasingly customizable and instant consumer reality in the marketplace, Experian is innovating in the credit space by allowing consumers to contribute information to their credit profiles via access to their online bank accounts.

For decades, Experian has been a leader in educating consumers on credit: what goes into a credit score, how to raise it and how to maintain it. Now, as part of our mission to be the consumer’s bureau, Experian is ushering in a new age of consumer empowerment with Boost. Building on an already established and full-fledged suite of consumer products, Experian Boost is the next-generation offering: a free online platform that places control in consumers’ hands to influence their credit scores.

The platform will feature a sign-in verification, during which consumers grant read-only permission for Experian Boost to connect to their online bank accounts to identify utility and telecommunications payments. After they verify their data and confirm that they want the account information added to their credit file, consumers will receive an instantly updated FICO® Score.

The history of credit information spans several centuries, from a group of London tailors swapping information on customers to credit files kept on index cards and read out to subscribers over the telephone. Even with the credit industry now very much in the digital age, Experian Boost is a significant step forward for a credit bureau. This new capability educates consumers on what types of payment behavior impact their credit score while also empowering them to add information to change it.

This is a big win-win for consumers and lenders alike. As Experian takes this next big step as a traditional credit bureau, adding these data sources is a new and innovative way to help consumers gain access to the quality credit they deserve while promoting fair and responsible lending across the industry. Early analysis of Experian Boost’s impact on U.S. consumer credit scores showed promising results, and those findings offer an encouraging vision of the future for all consumers, especially those who have a limited credit history.

The benefit to lenders in adding these new data points will be a more complete view of the consumer, enabling more informed lending decisions. Only positive payment histories will be collected through the platform, and consumers can elect to remove the new data at any time.

Experian Boost will be available to all credit-active adults in early 2019, but consumers can visit www.experian.com/boost now to register for early access. By signing up for a free Experian membership, consumers will receive a free credit report immediately and will be among the first to experience the new platform. Experian Boost will apply to most leading consumer credit scores used by lenders. To learn more about the platform, visit www.experian.com/boost.
“We don’t know what we don’t know.” It’s a truth that seems to be on the minds of just about every financial institution these days. The market, not to mention the customer base, seems to be evolving more quickly now than ever before. Mergers, acquisitions and partnerships, along with new competitors entering the space, are a daily headline. Customers expect the same seamless user experience and instant gratification they’ve come to expect from companies like Amazon in just about every interaction they have, including with their financial institutions. Broadly, financial institutions have been slow to respond, both in the products they offer their customers and prospects and in how they present those products. Not surprisingly, only 26% of customers feel like their financial institutions understand and appreciate their needs.

So it’s not hard to see why there might be uncertainty as to how a financial institution should respond or what it should do next. But what if you could know what you don’t know about your customer and industry data? Sound too good to be true? It’s not — it’s exactly what Experian’s Ascend Analytical Sandbox was built to do.

“At OneMain we’ve used Sandbox for a lot of exploratory analysis and feature development,” said Ryland Ely, a modeler at OneMain Financial, an Experian client, and a Sandbox user. For example, “we’ve used a loan amount model built on Sandbox data to try and flag applications where we might be comfortable with the assigned risk grade but we’re concerned we might be extending too much or too little credit,” he said.

The first product built on Ascend, Experian’s big data platform, the Analytical Sandbox is an analytics environment that can have enterprise-wide impact. It provides users instant access to near real-time customer data, actionable analytics and intelligence tools, along with a network of industry and support experts, to drive the most value out of their data and analytics. Developed with scalability, flexibility, efficiency and security top of mind, the Sandbox is a hybrid-cloud system that leverages the high availability and security of Amazon Web Services. This eliminates the need, time and infrastructure costs associated with creating an internally hosted environment. Additionally, our web-based interface speeds access to data and tools in your dedicated Sandbox, all behind the protection of Experian’s firewall. In addition to being supported by a revolutionized tech stack backed by an $825 million annual investment, Sandbox enables use of industry-leading business intelligence tools like SAS, RStudio, H2O, Python, Hue and Tableau.

Where the Ascend Sandbox really shines is in the amount and quality of the data that’s put into it. Built by the world’s largest global information services provider, the Sandbox brings the full power of Experian’s 17+ years of full-file historical tradeline data, boasting a data accuracy rate of 99.9%. The Sandbox also gives users the option to incorporate additional data sets, including commercial small business data and, soon, real estate data, among others. Alternative data assets add coverage of the roughly 50 million consumers who use some sort of alternative financial service, along with rental and utility payment data. And in addition to Experian’s data on 220+ million credit-active consumers, small business and other data sets, the Sandbox allows companies to integrate their own customer data into the system.
All data is depersonalized and pinned to allow companies to fully leverage the value of Experian’s patented attributes, scores and models. The Ascend Sandbox allows companies to mine the data for business intelligence to define strategy and translate those findings into data visualizations that communicate and win buy-in throughout their organization. But here is where customers are really identifying the value in this big data solution: taking those business intelligence insights and moving the resulting models and strategies from the Sandbox directly into a production environment. After all, amassing data is worthless unless you’re able to use it.

That’s why 15 of the top financial institutions globally are using the Experian Ascend Sandbox not just for benchmarking and data visualization but also for risk modeling, score migration, share-of-wallet analysis, market entry, cross-sell and much more. Moreover, clients are seeing time savings, deeper insights and reduced compliance concerns as a result of consolidating their production data and development platform inside Sandbox.

“Sandbox is often presented as a tool for visualization or reporting, sort of creating summary statistics of what’s going on in the market. But as a modeler, my perspective is that it has application beyond just those things,” said Ely.

To learn more about the Experian Ascend Analytical Sandbox and hear more about how OneMain Financial is getting value out of the Sandbox, watch this on-demand webinar.
Your model is only as good as your data, right? Actually, there are many considerations in developing a sound model, one of which is data. Yet if your data is bad or dirty or doesn’t represent the full population, can it be used? This is where sampling can help. When done right, sampling can lower your cost to obtain the data needed for model development. When done well, sampling can turn a tainted and underrepresented data set into a sound and viable model development sample.

First, define the population to which the model will be applied once it’s finalized and implemented. Determine what data is available and what population segments must be represented within the sampled data. The more variability in internal factors — such as changes in marketing campaigns, risk strategies and product launches — and external factors — such as economic conditions or competitor presence in the marketplace — the larger the sample size needed. A model developer often will need to sample over time to incorporate seasonal fluctuations in the development sample.

The most robust samples are pulled from data that best represents the full population to which the model will be applied. It’s important to ensure your data sample includes customers or prospects declined by the prior model and strategy, as well as approved but nonactivated accounts. This ensures full representation of the population to which your model will be applied. Also, consider the number of predictors or independent variables that will be evaluated during model development, and increase your sample size accordingly.

When it comes to spotting dirty or unacceptable data, the golden rule is know your data and know your target population. Spend time evaluating your intended population and group profiles across several important business metrics. Don’t underestimate the time needed to complete a thorough evaluation.

Next, select the data from the population to aptly represent the population within the sampled data. Determine the best sampling methodology that will support the model development and business objectives. Sampling generates a smaller data set for use in model development, allowing the developer to build models more quickly. Reducing the data set’s size decreases the time needed for model computation and saves storage space without losing predictive performance. Once the data is selected, weights are applied so that each record appropriately represents the full population to which the model will be applied.

Several traditional techniques can be used to sample data (a simple sketch of the last one follows this post):

Simple random sampling — Each record is chosen by chance, and each record in the population has an equal chance of being selected.

Random sampling with replacement — Each record chosen by chance is included in the subsequent selection.

Random sampling without replacement — Each record chosen by chance is removed from subsequent selections.

Cluster sampling — Records from the population are sampled in groups, such as region, over different time periods.

Stratified random sampling — This technique allows you to sample different segments of the population at different proportions. In some situations, stratified random sampling is helpful in selecting segments of the population that aren’t as prevalent as other segments but are equally vital within the model development sample.

Learn more about how Experian Decision Analytics can help you with your custom model development needs.
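As a brief illustration of stratified random sampling with the inverse-proportion weighting described above, here is a minimal sketch in Python using pandas. The segment names, fractions and column names are hypothetical, chosen only to show the mechanics.

```python
# A minimal, hypothetical sketch of stratified random sampling: oversample a
# sparse but vital segment, then weight records back to the full population.
import pandas as pd

population = pd.DataFrame({
    "segment": ["thick_file"] * 9000 + ["thin_file"] * 1000,
    "balance": range(10_000),
})

# Sample 10% of the dominant segment but 50% of the rare one.
fractions = {"thick_file": 0.10, "thin_file": 0.50}
sample = (
    population.groupby("segment", group_keys=False)
    .apply(lambda seg: seg.sample(frac=fractions[seg.name], random_state=42))
)

# Inverse-fraction weights let the sample again represent the population.
sample["weight"] = sample["segment"].map(lambda s: 1 / fractions[s])
print(sample["segment"].value_counts())  # 900 thick_file, 500 thin_file
```

The thin-file segment would contribute only about 100 records under simple random sampling at 10%; stratifying keeps it well represented while the weights preserve population-level totals.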
I believe it was George Bernard Shaw who once said something along the lines of, “If economists were laid end-to-end, they’d never come to a conclusion, at least not the same conclusion.” It often feels the same way when it comes to big data analytics around customer behavior. As you look at new tools to put your customer insights to work for your enterprise, you likely have questions coming from across your organization: Models always seem to take forever to develop; how sure are we that the results are still accurate? What data did we use in this analysis; do we need to worry about compliance or security?

To answer these questions, and in an effort to best utilize customer data, the most forward-thinking financial institutions are turning to analytical environments, or sandboxes, to solve their big data problems. But what functionality is right for your financial institution? In your search for a sandbox solution to solve the business problem of big data, make sure you keep these top four features in mind.

Efficiency: Building an internal data archive with effective business intelligence tools is expensive, time-consuming and resource-intensive. That’s why investing in a sandbox makes the most sense when it comes to drawing the value out of your customer data. By providing immediate access to the data environment at all times, the best systems can reduce the time from data input to decision by at least 30%. Another way the right sandbox can help you achieve operational efficiencies is by direct integration with your production environment. Pretty charts and graphs are great and can be very insightful, but the best sandbox goes beyond just business intelligence and should allow you to immediately put models into action.

Scalability and flexibility: In implementing any new software system, scalability and flexibility are key when it comes to integration with your native systems and the system’s capabilities. This is even more imperative when implementing an enterprise-wide tool like an analytical sandbox. Look for systems that offer a hosted, cloud-based environment, like Amazon Web Services, that ensures operational redundancy, as well as browser-based access and system availability. The right sandbox will leverage a scalable software framework for efficient processing. It should also be programming-language agnostic, allowing for use of all industry-standard programming languages and analytics tools like SAS, RStudio, H2O, Python, Hue and Tableau. Moreover, you shouldn’t have to pay for software suites that your analytics teams aren’t going to use.

Support: Whether you have an entire analytics department at your disposal or a lean, start-up style team, you’re going to want the highest level of support when it comes to onboarding, implementation and operational success. The best sandbox solution for your company will have a robust support model in place to ensure client success. Look for solutions that offer hands-on instruction, flexible online or in-person training and analytical support. Look for solutions and data partners that also offer the consultative help of industry experts when your company needs it.

Data, data and more data: Any analytical environment is only as good as the data you put into it. It should, of course, include your own client data. However, relying exclusively on your own data can lead to incomplete analysis, missed opportunities and reduced impact.
When choosing a sandbox solution, pick a system that will include the most local, regional and national credit data, in addition to alternative data and commercial data assets, on top of your own data. The optimum solutions will have years of full-file, archived tradeline data, along with attributes and models, for the most robust results. Be sure your data partner has accounted for opt-outs, excludes data precluded by legal or regulatory restrictions and anonymizes data files when linking your customer data. Data accuracy is also imperative here. Choose a big data partner who is constantly monitoring and correcting discrepancies in customer files across all bureaus. The best partners will have data accuracy rates at or above 99.9%.

Solving the business problem around your big data can be a daunting task. However, investing in analytical environments or sandboxes can offer a solution. Finding the right solution and data partner is critical to your success. As you begin your search for the best sandbox for you, be sure to look for solutions that offer the right combination of operational efficiency, flexibility and support, all combined with the most robust national data, along with your own customer data.

Are you interested in learning how companies are using sandboxes to make it easier, faster and more cost-effective to drive actionable insights from their data? Join us for this upcoming webinar.

Register for the Webinar
This is an exciting time to work in big data analytics. Here at Experian, we have more than 2 petabytes of data in the United States alone. In the past few years, because of high data volume, more computing power and the availability of open-source code algorithms, my colleagues and I have watched excitedly as more and more companies are getting into machine learning. We’ve observed the growth of competition sites like Kaggle, open-source code sharing sites like GitHub and various machine learning (ML) data repositories.

We’ve noticed that on Kaggle, two algorithms win over and over at supervised learning competitions:

- If the data is well-structured, teams that use Gradient Boosting Machines (GBM) seem to win.
- For unstructured data, teams that use neural networks win pretty often.

Modeling is both an art and a science. Those winning teams tend to be good at what the machine learning people call feature generation and what we credit scoring people call attribute generation. We have nearly 1,000 expert data scientists in more than 12 countries, many of whom are experts in traditional consumer risk models — techniques such as linear regression, logistic regression, survival analysis, CART (classification and regression trees) and CHAID analysis. So naturally I’ve thought about how GBM could apply in our world.

Credit scoring is not quite like a machine learning contest. We have to be sure our decisions are fair and explainable and that any scoring algorithm will generalize to new customer populations and stay stable over time. Increasingly, clients are sending us their data to see what we can do with newer machine learning techniques. We combine their data with our bureau data and even third-party data, we use our world-class attributes and develop custom attributes, and we see what comes out. It’s fun — like getting paid to enter a Kaggle competition! For one financial institution, GBM armed with our patented attributes found a nearly 5 percent lift in KS (the Kolmogorov-Smirnov statistic) when compared with traditional statistics.

At Experian, we use the Extreme Gradient Boosting (XGBoost) implementation of GBM that, out of the box, has regularization features we use to prevent overfitting. But it’s missing some features that we and our clients count on in risk scoring. Our Experian DataLabs team worked with our Decision Analytics team to figure out how to make it work in the real world. We found answers for a couple of important issues (a simple sketch of the first follows this post):

Monotonicity — Risk managers count on the ability to impose what we call monotonicity: in application scoring, applications with better attribute values should score as lower risk than applications with worse values. For example, if consumer Adrienne has fewer delinquent accounts on her credit report than consumer Bill, all other things being equal, Adrienne’s machine learning score should indicate lower risk than Bill’s score.

Explainability — We were able to adapt a fairly standard “Adverse Action” methodology from logistic regression to work with GBM.

There has been enough enthusiasm around our results that we’ve just turned it into a standard benchmarking service. We help clients appreciate the potential for these new machine learning algorithms by evaluating them on their own data. Over time, the acceptance and use of machine learning techniques will become commonplace among model developers as well as internal validation groups and regulators.
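The post doesn’t share Experian’s implementation, but the open-source XGBoost package exposes the same monotonicity idea directly. Below is a minimal sketch on synthetic data (all feature names and numbers are illustrative assumptions) showing how a monotone constraint guarantees that more delinquencies never produce a lower-risk score.

```python
# A minimal sketch of monotonicity constraints with open-source XGBoost,
# on synthetic data (not Experian attributes or models).
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
n = 10_000
delinquent_accts = rng.poisson(1.5, n)   # risk should rise with this
utilization = rng.uniform(0, 1, n)       # and with this
logit = -2.0 + 0.6 * delinquent_accts + 1.5 * utilization
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))
X = np.column_stack([delinquent_accts, utilization])

model = xgb.XGBClassifier(
    n_estimators=200,
    max_depth=3,
    learning_rate=0.05,
    monotone_constraints="(1,1)",  # score must be non-decreasing in each feature
    reg_lambda=1.0,                # regularization to curb overfitting
)
model.fit(X, y)

# Adrienne (fewer delinquencies) must score lower risk than Bill,
# all other attributes equal.
adrienne = model.predict_proba([[0, 0.4]])[0, 1]
bill = model.predict_proba([[4, 0.4]])[0, 1]
print(f"Adrienne: {adrienne:.3f}, Bill: {bill:.3f}")  # Adrienne <= Bill, guaranteed
```

Without the constraint, a boosted model could learn locally non-monotone splits that are hard to defend to a risk manager; with it, the ordering holds for every pair of otherwise-identical applicants.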
Whether you’re a data scientist looking for a cool place to work or a risk manager who wants help evaluating the latest techniques, check out our weekly data science video chats and podcasts.
If your company is like many financial institutions, it’s likely the discussion around big data and financial analytics has been an ongoing conversation. For many financial institutions, data isn’t the problem, but rather what could or should be done with it. Research has shown that only about 30% of financial institutions are successfully leveraging their data to generate actionable insights, and customers are noticing. According to a recent study from Capgemini, only 30% of US customers and 26% of UK customers feel like their financial institutions understand their needs. No matter how much data you have, it’s essentially just ones and zeroes if you’re not using it.

So how do banks, credit unions and other financial institutions that capture and consume vast amounts of data use that data to innovate, improve the customer experience and stay competitive? The answer, you could say, is written in the sand. The most forward-thinking financial institutions are turning to analytical environments, also known as sandboxes, to solve the business problem of big data.

As the name suggests, a sandbox is an environment that contains all the materials and tools one might need to create, build and collaborate around their data. A sandbox gives data-savvy banks, credit unions and FinTechs access to depersonalized credit data from across the country. Using custom dashboards and data visualization tools, they can manipulate the data with predictive models for different micro- and macro-level scenarios. The added value of a sandbox is that it becomes a one-stop-shop data tool for the entire enterprise. This saves the time normally required in the back and forth of acquiring data for a specific project or particular data sets. The best systems utilize the latest open-source technology in artificial intelligence and machine learning to deliver intelligence that can inform regional trends, uncover consumer insights and highlight market opportunities.

From industry benchmarking to market entry and expansion research, and from campaign performance to vintage analysis, reject inferencing and much more, an analytical sandbox gives you the data to create actionable analytics and insights across the enterprise right when you need it, not months later. The result is the ability to empower your customers to make financial decisions when, where and how they want. Keeping them happy keeps your financial institution relevant and competitive. Isn’t it time to put your data to work for you?

Learn more about how Experian can solve your big data problems.

Interested to see a live demo of the Ascend Sandbox? Register today for our webinar “Big Data Can Lead to Even Bigger ROI with the Ascend Sandbox.”
Machine learning (ML), the newest buzzword, has swept into the lexicon and captured the interest of us all. Its recent, widespread popularity has stemmed mainly from the consumer perspective. Whether it’s virtual assistants, self-driving cars or romantic matchmaking, ML has rapidly positioned itself in the mainstream.

Though ML may appear to be a new technology, its use in commercial applications has been around for some time. In fact, many of the data scientists and statisticians at Experian are considered pioneers in the field of ML, going back decades. Our team has developed numerous products and processes leveraging ML, from our world-class consumer fraud and ID protection to credit data products like our Trended 3D™ attributes. In fact, we were just highlighted in the Wall Street Journal for how we’re using machine learning to improve our internal IT performance.

ML’s ability to consume vast amounts of data to uncover patterns and deliver results that are not humanly possible otherwise is what makes it unique and applicable to so many fields. This predictive power has now sparked interest in the credit risk industry. Unlike fraud detection, where ML is well-established and used extensively, credit risk modeling has until recently taken a cautionary approach to adopting newer ML algorithms. Because of regulatory scrutiny and a perceived lack of transparency, ML hasn’t experienced the broad acceptance enjoyed by some of credit risk modeling’s more established methods.

When it comes to credit risk models, delivering the most predictive score is not the only consideration for a model’s viability. Modelers must be able to explain and detail the model’s logic, or its “thought process,” for calculating the final score. This means taking steps to ensure the model’s compliance with the Equal Credit Opportunity Act, which forbids discriminatory lending practices. Federal laws also require adverse action responses to be sent by the lender if a consumer’s credit application has been declined, so the model must be able to highlight the top reasons for a less-than-optimal score. And so, while ML may be able to deliver the best predictive accuracy, its ability to explain how the results are generated has always been a concern. ML has been stigmatized as a “black box,” where data mysteriously gets transformed into the final predictions without a clear explanation of how.

However, this is changing. Depending on the ML algorithm applied to credit risk modeling, we’ve found risk models can offer the same transparency as more traditional methods such as logistic regression. For example, gradient boosting machines (GBMs) are designed as a predictive model built from a sequence of several decision tree submodels. The very nature of GBMs’ decision tree design allows statisticians to explain the logic behind the model’s predictive behavior (a simple sketch follows this post). We believe model governance teams and regulators in the United States may become comfortable with this approach more quickly than with deep learning or neural network algorithms, since GBMs are represented as sets of decision trees that can be explained, while neural networks are represented as long sets of cryptic numbers that are much harder to document, manage and understand.

In future blog posts, we’ll discuss the GBM algorithm in more detail and how we’re using its predictability and transparency to maximize credit risk decisioning for our clients.
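To see that tree-by-tree transparency in practice, here is a minimal sketch using scikit-learn’s GradientBoostingClassifier on synthetic data, a stand-in for (not a reproduction of) a production credit risk model. Every boosted submodel is an ordinary decision tree that can be printed and audited.

```python
# A minimal sketch: a GBM is a sequence of small decision trees, and each
# tree can be dumped as readable if/then rules. Synthetic data only.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import export_text

X, y = make_classification(n_samples=2000, n_features=5, random_state=0)
gbm = GradientBoostingClassifier(n_estimators=50, max_depth=2, random_state=0)
gbm.fit(X, y)

# Print the first boosted tree; all 50 submodels are equally readable.
first_tree = gbm.estimators_[0, 0]
print(export_text(first_tree, feature_names=[f"attr_{i}" for i in range(5)]))
```

Contrast this with a neural network, whose weight matrices admit no comparable line-by-line reading; that difference is the heart of the governance argument made above.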
The August 2018 LinkedIn Workforce Report states some interesting facts about data science and the current workforce in the United States. Demand for data scientists is off the charts, but there is a data science skills shortage in almost every U.S. city — particularly in the New York, San Francisco and Los Angeles areas. Nationally, there is a shortage of more than 150,000 people with data science skills.

One way companies in financial services and other industries have coped with the skills gap in analytics is by using outside vendors. A 2017 Dun & Bradstreet and Forbes survey reported that 27 percent of respondents cited a skills gap as a major obstacle to their data and analytics efforts. Outsourcing data science work makes it easier to scale up and scale down as needs arise. But surprisingly, more than half of respondents said the third-party work was superior to their in-house analytics.

At Experian, we have participated in quite a few outsourced analytics projects. Here are a few of the lessons we’ve learned along the way:

Manage expectations: Everyone has their own management style, but to be successful, you must be proactively involved in managing the partnership with your provider. Doing so will keep them aligned with your objectives and prevent quality degradation or cost increases as you become more tied to them.

Communication: Creating open and honest communication between executive management and your resource partner is key. You need to be able to discuss what is working well and what isn’t. This will help ensure your partner has a thorough understanding of your goals and objectives and will properly manage any bumps in the road.

Help external resources feel like part of the team: When you’re working with external resources, either offshore or onshore, they are typically in another location, which can make them feel like they aren’t part of the team and therefore not directly tied to the business goals of the project. To help bridge the gap, holding regular status meetings via video conference can help everyone feel like part of the team. Within these meetings, provide information on the goals and objectives of the project. This way, the message comes directly from you, which will make external resources feel more involved and give them a clear understanding of what they need to do to be successful. Being able to put faces to names, as well as having direct communication with you, will help external employees feel included.

Drive engagement through recognition programs: Research has shown that employees are more engaged in their work when they receive recognition for their efforts. While you may not be able to provide a monetary award, recognition is still a big driver of engagement. It can be as simple as recognizing a job well done during your video conference meetings, providing certificates of excellence or sending a simple thank-you card to those who are performing well. Taking the extra time to make your external workforce feel appreciated will produce engaged resources that help drive your business goals forward.

Industry training: Your external resources may have the skills needed to perform the job successfully, but they may not have specific industry knowledge geared toward your business. Work with your partner to determine where they have expertise and where you can work together to provide training. Ensure your external workforce has a solid understanding of the business line they will be supporting.
If you’ve decided to augment your staff for your next big project, Experian® can help. Our Analytics on Demand™ service provides senior-level analysts, either onshore or offshore, who can help with analytical data science and modeling work for your organization.
As I mentioned in my previous blog, model validation is an essential step in evaluating a recently developed predictive model’s performance before finalizing and proceeding with implementation. An in-time validation sample is created by setting aside a portion of the total model development sample so the predictive accuracy can be measured on data not used to develop the model. However, if few records in the target performance group are available, splitting the total model development sample into development and in-time validation samples will leave too few records in the target group for use during model development. An alternative approach to generating a validation sample is to use a resampling technique. There are many different types and variations of resampling methods. This blog will address a few common techniques (a simple sketch of the second follows this post).

Jackknife technique — An iterative process whereby an observation is removed from each subsequent sample generation. So if there are N observations in the data, jackknifing calculates the model estimates on N different samples, with each sample having N - 1 observations. The model then is applied to each sample, and an average of the model predictions across all samples is derived to generate an overall measure of model performance and prediction accuracy. The jackknife technique can be broadened to remove a group of observations from each subsequent sample generation while giving each observation in the data set equal opportunity for inclusion and exclusion.

K-fold cross-validation — Generates multiple validation data sets from the holdout sample created for the model validation exercise, i.e., the holdout data is split into K subsets. The model then is applied to the K validation subsets, with each subset held out during the iterative process as the validation set while the model scores the remaining K - 1 subsets. Again, an average of the predictions across the multiple validation samples is used to create an overall measure of model performance and prediction accuracy.

Bootstrap technique — Generates subsets from the full model development data sample, with replacement, producing multiple samples generally of equal size. Thus, with a total sample size of N, this technique generates random samples of size N such that a single observation can be present in multiple subsets while another observation may not be present in any of the generated subsets. The generated samples are combined into a simulated larger data sample that can then be split into a development and an in-time, or holdout, validation sample.

Before selecting a resampling technique, it’s important to check and verify the data assumptions for each technique against the data sample selected for your model development, as some resampling techniques are more sensitive than others to violations of data assumptions.

Learn more about how Experian Decision Analytics can help you with your custom model development.
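For readers who want to see the K-fold idea in code, here is a minimal sketch using scikit-learn on synthetic data. It shows standard K-fold cross-validation, in which the model is refit on the remaining K - 1 folds at each iteration, a close cousin of the holdout-sample variant described above; either way, the per-fold results are averaged into a single overall measure of prediction accuracy.

```python
# A minimal sketch of K-fold cross-validation with scikit-learn: each of the
# K subsets is held out once, and the K scores are averaged into one overall
# measure of prediction accuracy. Synthetic data only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")  # K = 5
print(f"Per-fold AUC: {scores.round(3)}, mean: {scores.mean():.3f}")
```

The same idea extends to the bootstrap: scikit-learn’s sklearn.utils.resample draws samples with replacement, matching the description above.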