In 2011, data scientists and credit risk managers finally found an appropriate analogy to explain what we do for a living. "You know Moneyball? What Paul DePodesta and Billy Beane did for the Oakland A's, I do for XYZ Bank." You probably remember the story: Oakland had to squeeze the most value out of its limited budget for hiring free agents, so it used analytics — the new baseball "sabermetrics" created by Bill James — to make data-driven decisions that were counterintuitive to the experienced scouts. Michael Lewis told the story in a book that was an incredible bestseller and led to a hit movie. The year after the movie was made, Harvard Business Review declared that data science was "the sexiest job of the 21st century." Coincidence?

The importance of data

Moneyball emphasized the recognition, through sabermetrics, that certain players' abilities had been undervalued. In Travis Sawchik's bestseller Big Data Baseball: Math, Miracles, and the End of a 20-Year Losing Streak, he notes that the analysis would not have been possible without the data. Early visionaries, including John Dewan, began collecting baseball data at games all over the country in a volunteer program called Project Scoresheet. Eventually they were collecting a million data points per season. In a similar fashion, credit data pioneers, such as TRW's Simon Ramo, began systematically compiling basic credit information into credit files in the 1960s.

Recognizing that data quality is the key to insights and decision-making, and responding to the demand for objective data, Dewan formed two companies — Sports Team Analysis and Tracking Systems (STATS) and Baseball Info Solutions (BIS). It seems quaint now, but those companies collected and cleaned data using a small army of video scouts with stopwatches. Now data is collected in real time using systems such as PITCHf/x and the radar tracking system Statcast to provide insights that were never possible before. It's hard to find a news article about Game 1 of this year's World Series that doesn't discuss the launch angle or exit velocity of Eduardo Núñez's home run, but just a couple of years ago, neither statistic was even measured. Teams use proprietary biometric data to keep players healthy for games. Even neurological monitoring promises to provide new insights and may lead to changes in the game.

Similarly, lenders are finding that so-called "nontraditional data" can open up credit to consumers who might have been unable to borrow money in the past. This includes nontraditional Fair Credit Reporting Act (FCRA)–compliant data on recurring payments such as rent and utilities, checking and savings transactions, and payments to alternative lenders like payday and short-term loans. Newer fintech lenders are innovating constantly — using permissioned, behavioral and social data to make it easier for their customers to open accounts and borrow money. Similarly, some modern banks use techniques that go far beyond passwords and even multifactor authentication to verify their customers' identities online. For example, identifying consumers through their mobile device can improve the user experience greatly. Some lenders are even using behavioral biometrics to improve their online and mobile customer service practices.

Continuously improving analytics

Bill James and his colleagues developed a statistic called wins above replacement (WAR) that summarized the value of a player as a single number.
WAR was never intended to be a perfect summary of a player's value, but it's very convenient to have a single number to rank players. Using the same mindset, early credit risk managers developed credit scores that summarized applicants' risk based on their credit history at a single point in time. Just as WAR is only one measure of a player's abilities, good credit managers understand that a traditional credit score is an imperfect summary of a borrower's credit history. Newer scores, such as VantageScore® credit scores, are based on a broader view of applicants' credit history, such as credit attributes that reflect how their financial situation has changed over time. More sophisticated financial institutions, though, don't rely on a single score. They use a variety of data attributes and scores in their lending strategies.

Just a few years ago, simply using data to choose players was a novel idea. Now new measures such as defense-independent pitching statistics drive changes on the field. Sabermetrics, once defined as the application of statistical analysis to evaluate and compare the performance of individual players, has evolved to be much more comprehensive. It now encompasses the statistical study of nearly all in-game baseball activities.

A wide variety of data-driven decisions

Sabermetrics began being used for recruiting players in the 1980s. Today it's used on the field as well as in the back office. Big Data Baseball gives the example of the "Ted Williams shift," a defensive technique that was seldom used between 1950 and 2010. In the world after Moneyball, it has become ubiquitous. Likewise, pitchers alter their arm positions and velocity based on data — not only to throw more strikes, but also to prevent injuries.

Similarly, when credit scores were first introduced, they were used only in originations. Lenders established a credit score cutoff that was appropriate for their risk appetite and used it for approving and declining applications. Now lenders are using Experian's advanced analytics in a variety of ways that the credit scoring pioneers might never have imagined:

- Improving the account opening experience — for example, by reducing friction online
- Detecting identity theft and synthetic identities
- Anticipating bust-out activity and other first-party fraud
- Issuing the right offer to each prescreened customer
- Optimizing interest rates
- Reviewing and adjusting credit lines
- Optimizing collections

Analytics is no substitute for wisdom

Data scientists like those at Experian remind me that in banking, as in baseball, predictive analytics is never perfect. What keeps finance so interesting is the inherent unpredictability of the economy and human behavior. Likewise, the play on the field determines who wins each ball game: anything can happen. Rob Neyer's book Power Ball: Anatomy of a Modern Baseball Game quotes the Houston Astros director of decision sciences: "Sometimes it's just about reminding yourself that you're not so smart."
This is an exciting time to work in big data analytics. Here at Experian, we have more than 2 petabytes of data in the United States alone. In the past few years, because of high data volume, more computing power and the availability of open-source algorithms, my colleagues and I have watched excitedly as more and more companies get into machine learning. We've observed the growth of competition sites like Kaggle, open-source code sharing sites like GitHub and various machine learning (ML) data repositories.

We've noticed that on Kaggle, two algorithms win over and over at supervised learning competitions:

- If the data is well-structured, teams that use Gradient Boosting Machines (GBM) seem to win.
- For unstructured data, teams that use neural networks win pretty often.

Modeling is both an art and a science. Those winning teams tend to be good at what the machine learning people call feature generation and what we credit scoring people call attribute generation.

We have nearly 1,000 expert data scientists in more than 12 countries, many of whom are experts in traditional consumer risk models — techniques such as linear regression, logistic regression, survival analysis, CART (classification and regression trees) and CHAID analysis. So naturally I've thought about how GBM could apply in our world. Credit scoring is not quite like a machine learning contest. We have to be sure our decisions are fair and explainable and that any scoring algorithm will generalize to new customer populations and stay stable over time.

Increasingly, clients are sending us their data to see what we can do with newer machine learning techniques. We combine their data with our bureau data and even third-party data, we use our world-class attributes and develop custom attributes, and we see what comes out. It's fun — like getting paid to enter a Kaggle competition! For one financial institution, GBM armed with our patented attributes found a nearly 5 percent lift in KS (the Kolmogorov–Smirnov statistic) when compared with traditional statistics.

At Experian, we use the Extreme Gradient Boosting (XGBoost) implementation of GBM, which, out of the box, has regularization features we use to prevent overfitting. But it's missing some features that we and our clients count on in risk scoring. Our Experian DataLabs team worked with our Decision Analytics team to figure out how to make it work in the real world. We found answers for a couple of important issues:

- Monotonicity — Risk managers count on the ability to impose what we call monotonicity: in application scoring, applications with better attribute values should score as lower risk than applications with worse values. For example, if consumer Adrienne has fewer delinquent accounts on her credit report than consumer Bill, all other things being equal, Adrienne's machine learning score should indicate lower risk than Bill's score. (A minimal sketch of this idea appears below.)
- Explainability — We were able to adapt a fairly standard "Adverse Action" methodology from logistic regression to work with GBM.

There has been enough enthusiasm around our results that we've just turned it into a standard benchmarking service. We help clients appreciate the potential for these new machine learning algorithms by evaluating them on their own data. Over time, the acceptance and use of machine learning techniques will become commonplace among model developers as well as internal validation groups and regulators.
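To make the monotonicity requirement concrete, here is a minimal sketch using the monotone_constraints parameter of the open-source XGBoost library, which forces the model to treat more delinquencies and higher utilization as never reducing risk. The attributes, data and settings are hypothetical illustrations, not Experian's production configuration.

```python
# Minimal sketch: training a GBM risk model with monotonic constraints using
# the open-source XGBoost library. Data and settings are illustrative only.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(42)
n = 5_000

# Hypothetical attributes: more delinquent accounts and higher utilization
# should never lower predicted risk; longer history should never raise it.
X = np.column_stack([
    rng.poisson(1.5, n),          # number of delinquent accounts
    rng.uniform(0, 1, n),         # revolving utilization
    rng.integers(0, 360, n),      # months of credit history
])
logit = -2.0 + 1.2 * X[:, 0] + 1.5 * X[:, 1] - 0.004 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # 1 = bad (default)

dtrain = xgb.DMatrix(X, label=y,
                     feature_names=["delinq_cnt", "utilization", "months_on_file"])

params = {
    "objective": "binary:logistic",
    "max_depth": 3,
    "eta": 0.1,
    "lambda": 1.0,                       # L2 regularization to limit overfitting
    "monotone_constraints": "(1,1,-1)",  # +1 = risk may only rise, -1 = only fall
}
model = xgb.train(params, dtrain, num_boost_round=200)

# With the constraints in place, a consumer with fewer delinquencies can never
# score as higher risk than an otherwise identical consumer with more.
```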
Whether you’re a data scientist looking for a cool place to work or a risk manager who wants help evaluating the latest techniques, check out our weekly data science video chats and podcasts.
Electric vehicles are here to stay – and will likely gain market share as costs fall, travel ranges increase and charging infrastructure grows.
How a business prices its products is a dynamic process that drives customer satisfaction and loyalty, as well as business success. In the digital age, pricing is becoming even more complex. For example, companies like Amazon may revise the price of a hot item several times per day. Dynamic pricing models for consumer financial products can be especially difficult for at least four reasons:

- A complex regulatory environment.
- Fair lending concerns.
- The potential for adverse selection by risky consumers and fraudsters.
- The direct impact the affordability of a loan may have on both the consumer's ability to pay it and the likelihood that it will be prepaid.

If a lender offered the same interest rate and terms to every customer for the same loan product, low-risk customers would secure better rates elsewhere, and high-risk customers would not. The end result? Only the higher-risk customers would select the product, which would increase losses and reduce profitability. For this reason, the lending industry has established risk-based pricing. This pricing method addresses the above issue, since customers with different risk profiles are offered different rates. But it's limited.

More advanced lenders also understand the price elasticity of customer demand, because there are diverse reasons why customers decide to take up differently priced loans. Customers have different needs and risk profiles, so they react to a loan offer in different ways. Many factors determine a customer's propensity to take up an offer — for example, the competitive environment and availability of other lenders, how time-critical the decision is, and the loan terms offered. Understanding the customer's price elasticity allows a business to offer the ideal price to each customer to maximize profitability.

Pricing optimization is the superior method, assuming the lender has a scientific, data-driven approach to predicting how different customers will respond to different prices. Optimization allows an organization to determine the best offer for each customer to meet business objectives while adhering to financial and operational constraints such as volume, margin and credit risk. The business can assess trade-offs between competing objectives, such as maximizing revenue and maximizing volume, and determine the optimal decision to be made for each individual customer to best meet both objectives. (A simplified sketch of this idea appears at the end of this post.)

In the table below, you can see five benefits lenders realize when they improve their pricing segmentation with an optimization strategy. Interested in learning more about pricing optimization? Click here to download our full white paper, Price optimization in retail consumer lending.
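As a rough, simplified illustration of the idea referenced above (not Experian's optimization methodology), the sketch below searches a grid of candidate rates for the price that maximizes expected profit for each customer, combining a hypothetical take-up curve (price elasticity) with an assumed default probability and loss severity.

```python
# Illustrative price optimization sketch: pick, per customer, the rate that
# maximizes expected profit. The take-up and risk assumptions are placeholders.
import numpy as np

def take_up_probability(rate, price_sensitivity):
    """Chance the customer accepts the offer; falls as the rate rises."""
    return 1.0 / (1.0 + np.exp(price_sensitivity * (rate - 0.12)))

def expected_profit(rate, loan_amount, p_default, price_sensitivity):
    margin = loan_amount * (rate - 0.03)           # funding cost assumed at 3%
    loss = loan_amount * 0.60 * p_default          # assumed 60% loss given default
    return take_up_probability(rate, price_sensitivity) * (margin - loss)

candidate_rates = np.arange(0.06, 0.25, 0.005)

# Hypothetical customers: (loan amount, probability of default, price sensitivity)
customers = [
    (10_000, 0.02, 40.0),   # low risk, rate-sensitive
    (10_000, 0.08, 15.0),   # higher risk, less rate-sensitive
]

for amount, p_default, sensitivity in customers:
    profits = [expected_profit(r, amount, p_default, sensitivity) for r in candidate_rates]
    best = candidate_rates[int(np.argmax(profits))]
    print(f"PD={p_default:.0%}  optimal rate={best:.1%}  expected profit={max(profits):.0f}")
```

A production approach would also apply portfolio-level constraints on volume, margin and credit risk across all customers, rather than optimizing each offer in isolation.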
Machine learning (ML), the newest buzzword, has swept into the lexicon and captured the interest of us all. Its recent, widespread popularity has stemmed mainly from the consumer perspective. Whether it's virtual assistants, self-driving cars or romantic matchmaking, ML has rapidly moved into the mainstream. Though ML may appear to be a new technology, its use in commercial applications has been around for some time. In fact, many of the data scientists and statisticians at Experian are considered pioneers in the field of ML, going back decades. Our team has developed numerous products and processes leveraging ML, from our world-class consumer fraud and ID protection to credit data products like our Trended 3D™ attributes. In fact, we were just highlighted in The Wall Street Journal for how we're using machine learning to improve our internal IT performance.

ML's ability to consume vast amounts of data to uncover patterns and deliver results that are not humanly possible otherwise is what makes it unique and applicable to so many fields. This predictive power has now sparked interest in the credit risk industry. Unlike fraud detection, where ML is well-established and used extensively, credit risk modeling has until recently taken a cautionary approach to adopting newer ML algorithms. Because of regulatory scrutiny and a perceived lack of transparency, ML hasn't enjoyed the broad acceptance of some of credit risk modeling's more established techniques.

When it comes to credit risk models, delivering the most predictive score is not the only consideration for a model's viability. Modelers must be able to explain and detail the model's logic, or its "thought process," for calculating the final score. This means taking steps to ensure the model's compliance with the Equal Credit Opportunity Act, which forbids discriminatory lending practices. Federal laws also require adverse action responses to be sent by the lender if a consumer's credit application has been declined. This requires that the model be able to highlight the top reasons for a less than optimal score. And so, while ML may be able to deliver the best predictive accuracy, its ability to explain how the results are generated has always been a concern. ML has been stigmatized as a "black box," where data mysteriously gets transformed into the final predictions without a clear explanation of how.

However, this is changing. Depending on the ML algorithm applied to credit risk modeling, we've found risk models can offer the same transparency as more traditional methods such as logistic regression. For example, gradient boosting machines (GBMs) are designed as a predictive model built from a sequence of several decision tree submodels. The very nature of GBMs' decision tree design allows statisticians to explain the logic behind the model's predictive behavior. We believe model governance teams and regulators in the United States may become comfortable with this approach more quickly than with deep learning or neural network algorithms, since GBMs are represented as sets of decision trees that can be explained, while neural networks are represented as long arrays of cryptic numbers that are much harder to document, manage and understand. (A simplified sketch of this kind of transparency appears below.)

In future blog posts, we'll discuss the GBM algorithm in more detail and how we're using its predictive power and transparency to maximize credit risk decisioning for our clients.
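As a simplified illustration of the transparency tree-based models allow (not the specific adverse action methodology described in these posts), the sketch below extracts per-feature contributions to an individual applicant's GBM score and ranks them as candidate reasons. The data and attribute names are hypothetical.

```python
# Simplified illustration of deriving "top reason" candidates from a GBM score
# using per-feature prediction contributions. Data and features are hypothetical;
# this is not a specific adverse action methodology.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(7)
n = 2_000
feature_names = ["delinq_cnt", "utilization", "inquiries_6m", "months_on_file"]

X = np.column_stack([
    rng.poisson(1.0, n),
    rng.uniform(0, 1, n),
    rng.poisson(2.0, n),
    rng.integers(0, 300, n),
])
logit = -2.5 + 1.0 * X[:, 0] + 2.0 * X[:, 1] + 0.3 * X[:, 2] - 0.003 * X[:, 3]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # 1 = bad (default)

dtrain = xgb.DMatrix(X, label=y, feature_names=feature_names)
model = xgb.train({"objective": "binary:logistic", "max_depth": 3, "eta": 0.1},
                  dtrain, num_boost_round=150)

# Contributions: one column per feature plus a bias column, in log-odds units.
applicant = xgb.DMatrix(X[:1], feature_names=feature_names)
contribs = model.predict(applicant, pred_contribs=True)[0][:-1]

# Features pushing this score most strongly toward "bad" are reason-code candidates.
top_reasons = sorted(zip(feature_names, contribs), key=lambda fc: -fc[1])[:2]
print(top_reasons)
```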
Federal legislation makes verifying an individual's identity by scanning identity documents during onboarding legal in all 50 states

Originally posted on Mitek blog

The Making Online Banking Initiation Legal and Easy (MOBILE) Act officially became law on May 24, 2018, authorizing a national standard for banks to scan and retain information from driver's licenses and identity cards as part of a customer online onboarding process, via smartphone or website. This bill, which was proposed in 2017 with bipartisan support, allows financial institutions to fully deploy mobile technology that can make digital account openings across all states seamless and cost efficient. The MOBILE Act also stipulates that the digital image would be destroyed after account opening to further ensure customer data security. As an additional security measure, section 213 of the act mandates an update to the system to confirm matches of names to Social Security numbers.

"The additional security this process could add for online account origination was a key selling point with the Equifax data breach fresh on everyone's minds," Scott Sargent, of counsel in the law firm Baker Donelson's financial service practice, recently commented on AmericanBanker.com. Read the full article here.

Though digital banking and an online onboarding process have already been a best practice for financial institutions in recent years, the MOBILE Act officially overrules any potential state legislation that, up to this point, has not recognized digital images of identity documents as valid. The MOBILE Act states: "This bill authorizes a financial institution to record personal information from a scan, copy, or image of an individual's driver's license or personal identification card and store the information electronically when an individual initiates an online request to open an account or obtain a financial product. The financial institution may use the information for the purpose of verifying the authenticity of the driver's license or identification card, verifying the identity of the individual, or complying with legal requirements."

Why adopt online banking?

The recently passed MOBILE Act is a boon for both financial institutions and end users. The legislation:

- Enables and encourages financial institutions to meet their digital transformation goals
- Makes the process safe with digital ID verification capabilities and other security measures
- Reduces time, manual Know Your Customer (KYC) duties and costs to financial institutions for onboarding new customers
- Provides the convenient, on-demand experience that customers want and expect

The facts:

- 61% of people use their mobile phone to carry out banking activity.1
- 77% of Americans have smartphones.2
- 50 million consumers who are unbanked or underbanked use smartphones.3

The MOBILE Act doesn't require any regulatory implementation. Banks can access this real-time electronic process directly or through vendors. Read all you need to know about the MOBILE Act here. Find out more about a better way to manage fraud and identity services.

References

1 Mobile Ecosystem Forum, MEF Mobile Money Report (https://mobileecosystemforum.com/mobile-money-report/), Feb. 5, 2018.
2 Pew Research Center, Mobile Fact Sheet (http://www.pewinternet.org/fact-sheet/mobile/), Jan. 30, 2017.
3 The Federal Reserve System, Consumers and Mobile Financial Services 2015 (https://www.federalreserve.gov/econresdata/consumers-and-mobile-financial-services-report-201503.pdf), March 2015.
With credit card openings and usage increasing, now is the time to make sure your financial institution is optimizing its credit card portfolio. Here are some insights on credit card trends:

- 51% of consumers obtained a credit card application via a digital channel.
- 42% of credit card applications were completed on a mobile device.
- The top incentives when selecting a rewards card are cash back (81%), gas rewards (74%) and retail gift cards (71%).

Understanding and having a full view of your customers' activity, behaviors and preferences can help maximize your wallet share. More credit card insight>
First-party fraud is an identity-centric risk that changes over time. And the fact that no one knows the true size of first-party fraud is not the problem. It's a symptom. First-party fraud involves a person making financial commitments or defaulting on existing commitments using their own identity, a manipulated version of their own identity or a synthetic identity they control. With the identity owner involved, a critical piece of the puzzle is lost. Because fraud "treatments" tend to be all-or-nothing and rely on a victim, the consequences of applying traditional fraud strategies when first-party fraud is suspected can be too harsh and significantly damage the customer relationship.

Without feedback from a victim, first-party fraud hides in plain sight — in credit losses. As a collective, we've created lots of subsets of losses that nibble around the edges of first-party fraud, and we focus on reducing those. But I can't help thinking that if we were really trying to solve first-party fraud, we would collectively be doing a better job of measuring it. As the saying goes, "If you can't measure it, you can't improve it." Because behaviors exhibited during first-party fraud are difficult to distinguish from those of legitimate consumers who've encountered catastrophic life events, such as illness and unemployment, individual account performance isn't typically a good measurement. First-party fraud is a person-level event rather than an account-level event and needs to be viewed as such.

So why does first-party fraud slip through the cracks?

- Existing, third-party fraud prevention tools aren't trained to detect it.
- Underwriting relies on a point-in-time assessment, leaving lenders blind to intentions that may change after booking.
- When first-party fraud occurs, the different organizations that suffer losses attach different names to it based on their account-level view.
- It's hidden in credit losses, preventing you from identifying it for future analysis.

As an industry, we aren't going to be able to solve the problem of first-party fraud as long as three different organizations can look at an individual and declare, "Never pay!" "No. Bust-out!" "No! Charge-off!"

So, what do we need to stop doing?

- Stop thinking that it's a different problem based on when you enter the picture. Whether you opened an account five years ago or five minutes ago doesn't change the problem. It's still first-party fraud if the person who owns the identity is the one misusing it.
- Stop thinking that the financial performance of an account you maintain is the only relevant data.

And what do we need to start doing?

- See and treat first-party fraud as a continuous, person-level risk rather than a point-in-time event.
- Leverage machine learning techniques and robust data (including your own observations) to monitor for emerging risk over time.
- Apply multiple levels of treatments to respond, tighten controls and reduce exposure as risk grows.
- Define first-party fraud using a broader set of elements beyond your individual observations.
Customer Identification Program (CIP) solution through CrossCore®

Every day, I work closely with clients to reduce the negative side effects of fraud prevention. I hear the need for lower false-positive rates; maximum fraud detection in populations; and simple, streamlined verification processes. Lately, more conversations have turned toward ID verification needs for Customer Identification Program (CIP) administration. As it turns out, barriers to growth, high customer friction and high costs dominate the CIP landscape. While the marketplace struggles to manage the impact of fraud prevention, CIP routinely disrupts more than 10 percent of new customer acquisitions. Internally at Experian, we talk about this as the biggest ID problem our customers aren't solving.

Think about this: The fight for business in the CIP space quickly turned to price, and price was defined by unit cost. But what's the real cost? One of the dominant CIP solutions uses a series of hyperlinks to connect identity data. Every click is a new charge. Their website invites users to dig into the data — manually. Users keep digging, and they keep paying. And the challenges don't stop there. Consider the data sources used for these solutions. The winners of the price fight built CIP solutions around credit bureau header data. What does that do for growth? If the identity wasn't sufficiently verified when a credit report was pulled, does it make sense to go back to the same data source? Keep digging. Cha-ching, cha-ching.

Right about now, you might be feeling like there's some sleight of hand going on. The true cost of CIP administration is much more than a single unit price. It's many units, manual effort, recycled data and frustrated customers — and it impacts far more clients than fraud prevention. CIP needs have moved far beyond the demand for a low-cost solution.

We're thrilled to be leading the move toward bringing more robust data and decision capabilities to CIP through CrossCore®. With its open architecture and flexible decision structure, our CrossCore platform enables access to a diverse and robust set of data sources to meet these needs. CrossCore unites Experian data, client data and a growing list of available partner data to deliver an intelligent and cost-conscious approach to managing fraud and identity challenges. The next step will unify CIP administration, fraud analytics and a range of verification treatment options together on the CrossCore platform as well. Spoiler alert: We've already taken that step.
Trivia question: Millennials don’t purchase new vehicles. True or False?
There are many factors contributing to the success of dealerships. When it comes to dealers, empirical guidance is a great way to study effective advertising. Experian brought Auto, Targeting, and the Dealer Positioning System capabilities together in a nationwide study to answer the ultimate question: What drives sales? The answers can be found in Experian's 2018 Attribution Study. This is a wide-ranging, dealer-focused, sales-driven attribution study that analyzed a few key variables. We deployed 187,701 tracking pixels to devices in 41,012 distinct households, focused on 15 digital metrics to learn about shopper behavior, and tied that digital shopping data to 2,436 vehicle sales.

An industry first, Experian's ability to combine automotive registration data, sales data, and website analytics and online behavior data puts us in a position to do something that very few companies can do. We use household identifiers not only to see who bought a car and who bought specifically from a participating dealer, but also how they shopped the dealer's site. Our ability to accurately identify a household's digital behavior is based on the fact that we are a source compiler of the data and have it sitting under one roof. Others that attempt to provide this type of insight need to contract out for registrations, sales data imports from the dealership, website analytics, household identifiers, or all of the above, which generally adds time to the insights.

Using our sales-based approach, we can deliver unbiased attribution. Sales-based attribution is attributing credit to different advertising sources and campaigns based on actual vehicle sales, including those targeted consumers who may have purchased outside of the dealership. This is the Holy Grail of attribution for car dealers, since it ties an offline activity such as buying a car back to the online advertising that's taking up most of their budgets every month. Because of that offline-online disconnect, sales-based attribution is difficult. Other automotive attribution models are typically focused on website conversions or website behavior: "what advertising can I attribute website leads to" (conversions) or "what advertising is driving users who follow the behavior that I think shows they're likely to buy from my dealership" (website behavior).

What are the takeaways?

We found three takeaways from our study. First, we look at shopper behavior instead of isolating KPIs; later we will discuss how traditional website metrics do not tie in to sales. Second, we look at optimizing your paid advertising. Finally, we look at third-party investments: although third parties drive sales, they may not be your sales.

Looking at shopper behavior, not isolated KPIs

Traditional website metrics don't tell the sales story for dealers. Traditional conversion stats, such as VDPs or page views, are about equal for buyers and for all traffic. What this means is that, on average, buyers converted at a lower rate than overall website traffic. Looking solely at form submissions, hours-and-directions pageviews and mobile clicks-to-call doesn't give the best view of what advertising is driving sales. In fact, 98% of buyer traffic never submitted a form or visited the hours and directions page, yet these are the typical website conversions that dealers, vendors and advertising agencies focus on. Since traditional web metrics don't tell the story, there is a better measure: High-Value Users, or HVUs.
They purchase at a 34% higher rate than overall traffic even though they make up only 11% of all traffic. High-Value Users are an Experian-derived KPI. What makes someone an HVU are four different measurements. They must:

- Visit a website at least three times
- Spend at least six minutes on the site in total
- View at least eight pages in total
- View at least one VDP

(A rough illustrative sketch of how such a flag could be computed appears at the end of this post.)

The High-Value User metric correlates with sales better than Vehicle Detail Page (VDP) metrics. In this study, the correlation for VDP was measured at .595, which is rated a medium correlation. Meanwhile, HVU scored .698, which is rated a high correlation. Looking at many different behavioral KPIs, as we do with our High-Value User (HVU) metric, correlates better with sales than just looking at how many VDPs you had. Driving more VDPs won't necessarily help sales, but driving more HVUs is more likely to correlate with more sales. This also gets back to the attribution discussion above: Experian sales-based attribution is the best, and Experian's HVUs are a good method for web-based attribution. From this attribution study, High-Value Users are a vital group for dealers to focus on. In our next post, we will go over the second and third takeaways from the attribution study: optimizing paid advertising and evaluating third-party investments.
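Here is that sketch: a minimal, hypothetical example of flagging High-Value Users from session-level web analytics data using the four criteria above. The column names and data layout are assumptions for illustration, not Experian's implementation.

```python
# Hypothetical sketch: flag High-Value Users (HVUs) from session-level web
# analytics data using the four criteria described in the post.
import pandas as pd

sessions = pd.DataFrame({
    "household_id": ["H1", "H1", "H1", "H2", "H2"],
    "minutes_on_site": [3.0, 2.5, 1.5, 7.0, 1.0],
    "pages_viewed": [4, 3, 2, 5, 1],
    "vdp_views": [1, 0, 1, 0, 0],
})

per_household = sessions.groupby("household_id").agg(
    visits=("household_id", "size"),
    total_minutes=("minutes_on_site", "sum"),
    total_pages=("pages_viewed", "sum"),
    total_vdp_views=("vdp_views", "sum"),
)

# HVU definition: at least 3 visits, 6 minutes on site, 8 pages and 1 VDP view.
per_household["is_hvu"] = (
    (per_household["visits"] >= 3)
    & (per_household["total_minutes"] >= 6)
    & (per_household["total_pages"] >= 8)
    & (per_household["total_vdp_views"] >= 1)
)
print(per_household)
```

A household flagged this way could then be joined to registration or sales data to measure how well the HVU flag correlates with actual purchases.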
Although it's hard to imagine, some synthetic identities are being used for purposes other than fraud. Here are 3 types of common synthetic identities and why they're created:

- Bad — To circumvent lag times and delays in establishing a legitimate identity and data footprint.
- Worse — To "repair" credit, hoping to start again with a higher credit rating under a new, assumed identity.
- Worst — To commit fraud by opening various accounts with no intention of paying those debts or service fees.

While all these synthetic identity types are detrimental to the ecosystem shared by consumers, institutions and service providers, they should be separated by type — guiding appropriate treatment. Learn more in our new white paper produced with Whitepages Pro, Fighting synthetic identity theft: getting beyond Social Security numbers. Download now>
The economy remains steady, maintaining a positive outlook even though GDP growth slowed in the first quarter. Real estate is holding ground even as rates rise; we've reached a 7-year high in 30-year fixed-rate mortgages, which could have a longer-term effect on this market. Bankcard may be reaching its limit — outstanding balances hit $764 billion and delinquency rates continue to rise. While auto originations were flat in Q1, performance is improving as focus moves away from subprime lending. As we transition from 2017, keep an eye on inflation and interest rates and their possible short-term economic impact. Learn more about these and other economic trends with the on-demand recording of the webinar. Watch now
There is a delicate balance in delivering a digital experience that instills confidence while providing easy and convenient account access. When it comes to a frictionless, secure customer experience, our 2018 Global Fraud and Identity Report research showed:

- 52% of businesses have chosen to prioritize the user experience over detecting and mitigating fraud.
- 78% of consumers will create an account to complete an e-commerce purchase because it is a trusted brand/website.
- 60% of consumers will follow through with a transaction even if they have forgotten their user name or password.

Consumers believe that having simple, instant and easy-to-access verification methods is important to their experience when shopping online. Are you providing this? 2018 Global Fraud and Identity Report
Marketers are keenly aware of how important it is to "Know thy customer." Yet customer knowledge isn't restricted to the marketing-savvy. It's also essential to credit risk managers and model developers. Identifying and separating customers into distinct groups based on various types of behavior is foundational to building effective custom models. This integral part of custom model development is known as segmentation analysis.

Segmentation is the process of dividing customers or prospects into groupings based on similar behaviors, such as length of time as a customer or payment patterns like credit card revolvers versus transactors. The more similar, or homogeneous, the customer grouping, the less behavioral variation each segment's custom model has to account for.

So how many scorecards are needed to aptly score and mitigate credit risk? There are several general principles we've learned over the course of developing hundreds of models that help determine whether multiple scorecards are warranted and, if so, how many. A robust segmentation analysis contains two components. The first is the generation of potential segments, and the second is the evaluation of those segments. Here I'll discuss the generation of potential segments within a segmentation scheme. A second blog post will continue with a discussion on the evaluation of those segments.

When generating a customer segmentation scheme, several approaches are worth considering: heuristic, empirical and combined. A heuristic approach considers business learnings obtained through trial and error or experimental design. Portfolio managers will have insight into how segments of their portfolio behave differently, and that insight can and often should be included within a segmentation analysis. An empirical approach is data-driven and involves the use of quantitative techniques to evaluate potential customer segmentation splits. During this approach, statistical analysis is performed to identify forms of behavior across the customer population. Different interactive behavior for different segments of the overall population will correspond to different predictive patterns for these predictor variables, signifying that separate segment scorecards will be beneficial. (A simplified sketch of one such empirical technique appears below.) Finally, a combination of heuristic and empirical approaches considers both the business needs and data-driven results.

Once the set of potential customer segments has been identified, the next step in a segmentation analysis is the evaluation of those segments. Stay tuned as we look further into this topic. Learn more about how Experian Decision Analytics can help you with your segmentation or custom model development needs.
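As referenced above, here is a simplified, hypothetical sketch of one empirical technique for generating candidate segments: fitting a shallow decision tree on a portfolio-level outcome and reading its leaves as potential segment definitions. The feature names and data are illustrative assumptions, not a prescribed Experian methodology.

```python
# Simplified empirical segmentation sketch: use a shallow decision tree to
# propose candidate customer segments from portfolio data. The features, data
# and thresholds are hypothetical illustrations only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical portfolio attributes for each customer.
months_on_book = rng.integers(1, 120, n)        # length of time as a customer
revolver_flag = rng.binomial(1, 0.4, n)         # revolver vs. transactor behavior
utilization = rng.uniform(0, 1, n)

# Hypothetical bad (default) outcome whose drivers differ by behavior pattern,
# the kind of interaction that suggests separate segment scorecards.
logit = -3 + 2.5 * utilization * revolver_flag - 0.01 * months_on_book
bad = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([months_on_book, revolver_flag, utilization])
tree = DecisionTreeClassifier(max_depth=2, min_samples_leaf=500, random_state=0)
tree.fit(X, bad)

# Each leaf is a candidate segment; its splits (e.g., revolvers vs. transactors)
# can then be reviewed against business knowledge before building scorecards.
print(export_text(tree, feature_names=["months_on_book", "revolver_flag", "utilization"]))
```

The resulting splits would then go through the evaluation step discussed above, checking whether the predictive patterns genuinely differ across the proposed segments before separate scorecards are built.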