Advanced Analytics in the Insurance Industry


The advanced analytics market has seen strong, continuous growth in the past few years and is projected to stay on the same path. The insurance industry is relying heavily on two major areas of advanced analytics, big data analytics and predictive analytics, to solve its data issues.


Advanced analytics is a comprehensive set of analytical methods and techniques designed to help businesses discover patterns and trends, make accurate predictions, solve problems, and uncover deeper insights or generate recommendations. It is an umbrella term for several sub-fields of analytics that work together using predictive capabilities. Many industries, including the insurance industry, now use advanced analytic techniques to solve data-related issues and save millions of dollars.

An advanced data analytics package can continuously monitor expenditures for healthcare insurance services, using large volumes of data drawn from multiple sources, and detect unusual patterns or anomalies related to possible fraud, waste, and abuse (FWA) that a human inspecting the data manually might not have detected.

Fraud is generally defined as knowingly and willfully executing, or attempting to execute, fraudulent activities against any healthcare benefit program, or obtaining (by means of false or fraudulent pretenses, representations, or promises) any of the money or property owned by, or under the custody or control of, any healthcare benefit program. Waste is overutilization of services or other practices that, directly or indirectly, result in unnecessary costs to the healthcare system, including the Medicare and Medicaid programs. It is not generally considered to be caused by criminally negligent actions, but by the misuse of resources. Abuse is payment for items or services when there is no legal entitlement to that payment and the individual or entity has not knowingly and/or intentionally misrepresented facts to obtain payment.

Using advanced data analytics in FWA programs can also reduce the frequency of false positives, so that special investigation units and provider audit groups can focus their efforts where they are most likely to yield results.

Traditional Analysis in FWA Programs and Its Shortcomings:

Traditional analysis in the healthcare insurance sector has primarily relied on claims data, since it already includes substantial detail. However, claims data alone can give a misleading or incorrect picture because it is incomplete, and patterns may go undiscovered. With advanced analytics, the payer can also analyze demographic, geographic, and other data about the member. For example, the payer may realize that it is virtually impossible for any member treated by a given provider to pay the copay, highlighting possible improper activity. Historically, state program integrity (PI) initiatives have focused on the retrospective recovery of paid claims, often referred to as “pay and chase.” The payment of improper claims has the potential to grow as Medicaid becomes larger and more complex. It is not uncommon for providers to make coding mistakes or misunderstand ordering and billing rules, and it is a safe bet that fraud and abuse will continue to grow in sophistication.

Challenges in Fraud, Waste and Abuse Programs:

Each year, healthcare fraud, waste, and abuse cost hundreds of billions of dollars across the U.S., which translates to tens of millions of dollars for each payer. Payers are consistently challenged by the limited resources available to them to process the growing number of claims and make timely payments. At the same time, they need to continuously monitor for suspicious claims to prevent revenue losses. Identifying new patterns of aberrant behavior is a slow process, requiring many analysts to compare reports from different sources before they can confirm a new trend. This allows sophisticated fraudsters to rapidly evolve their strategies and outpace current detection models. Current fraud detection models are primarily rule-based and incorporate well-known indicators of fraud. For example, if a patient visits a certain hospital more than three times in one month, or if the patient receives services from a facility more than 100 miles from their residential address, the claims can be flagged by rules that incorporate these established indicators. However, the more subtle and evolving indicators of suspicious behavior are scattered throughout complex medical claims data and may be missed by traditional analytical techniques.
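
The two rules mentioned above can be sketched in a few lines of Python. This is only an illustration of how a rule-based check might look, not how any particular payer implements it. The `Claim` fields, the `flag_claims` function, and the reading of "one month" as a calendar month are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    # Hypothetical claim record; field names are assumptions for this sketch.
    claim_id: str
    patient_id: str
    facility_id: str
    service_date: date
    miles_from_home: float  # distance to the facility, assumed precomputed


MAX_VISITS_PER_MONTH = 3   # "more than three times in one month"
MAX_DISTANCE_MILES = 100   # "more than 100 miles from their residential address"


def flag_claims(claims):
    """Return {claim_id: [reasons]} for claims that trip at least one rule."""
    # Count visits per patient, facility, and calendar month.
    visits = Counter(
        (c.patient_id, c.facility_id, c.service_date.year, c.service_date.month)
        for c in claims
    )
    flagged = {}
    for c in claims:
        key = (c.patient_id, c.facility_id, c.service_date.year, c.service_date.month)
        reasons = []
        if visits[key] > MAX_VISITS_PER_MONTH:
            reasons.append("frequent_visits")
        if c.miles_from_home > MAX_DISTANCE_MILES:
            reasons.append("distant_facility")
        if reasons:
            flagged[c.claim_id] = reasons
    return flagged
```

Rules like these are easy to explain and audit, which is part of their appeal, but they only catch the indicators someone has already thought to encode.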

Prospects of Advanced Analytics in the Healthcare Insurance Industry


Predictive analytics: A data-driven approach to cost avoidance

Predictive analytics are increasingly used to compare Medicaid claims across provider peer groups and validated benchmarks. Claims failing to meet expected patterns in type and frequency of visits, diagnoses, prescriptions, and other factors are flagged for investigation.
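
As a rough illustration of peer-group comparison, the sketch below flags providers whose value for a single metric sits far above the average of their own peer group. The z-score approach, the 2.5 threshold, and the `flag_peer_outliers` name are assumptions chosen for simplicity; production systems typically compare many metrics against validated benchmarks.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_peer_outliers(providers, threshold=2.5):
    """providers: iterable of (provider_id, peer_group, metric_value).

    Returns the ids whose metric lies more than `threshold` standard
    deviations above the mean of their own peer group.
    """
    groups = defaultdict(list)
    for provider_id, peer_group, value in providers:
        groups[peer_group].append((provider_id, value))

    flagged = []
    for members in groups.values():
        values = [v for _, v in members]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:  # all peers identical: nothing stands out
            continue
        for provider_id, value in members:
            if (value - mu) / sigma > threshold:
                flagged.append(provider_id)
    return flagged
```

A flagged provider is a candidate for investigation, not a confirmed case. Small peer groups also limit how large a z-score can get, so very small groups are better handled with a different benchmark.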

These same technologies enable PI staff to categorically score providers to determine the level of screening, such as a background check or site visit. Visual displays of relationships among providers, service organizations, and beneficiaries help states identify and investigate fraudulent or ineligible providers, remove them from the system and prevent them from re-enrolling.
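
The relationships behind such visual displays can be approximated by grouping providers that share identifying details. The sketch below is one simple way to do this, using a union-find structure over shared attributes. The `linked_groups` function and the attribute strings are hypothetical, and real link analysis tools use richer matching than exact equality.

```python
from collections import defaultdict


def linked_groups(records):
    """records: {provider_id: set of identifying attributes (phone, address, ...)}.

    Returns the groups of two or more providers connected, directly or
    through intermediaries, by at least one shared attribute.
    """
    parent = {pid: pid for pid in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups fast
            x = parent[x]
        return x

    # Merge any provider with the first provider seen using the same attribute.
    first_seen = {}
    for pid, attributes in records.items():
        for attr in attributes:
            if attr in first_seen:
                parent[find(pid)] = find(first_seen[attr])
            else:
                first_seen[attr] = pid

    groups = defaultdict(set)
    for pid in records:
        groups[find(pid)].add(pid)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

For example, if provider A shares a phone number with B and an address with C, all three end up in one group even though B and C have nothing in common directly.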

Cost avoidance models

The following analytics-based approaches may be applied:

  • Predictive modeling: It is an umbrella term for a variety of methods that analyze relevant historical data to create a statistical model of future behavior. Analysts make predictions based on the model, “train” it to recognize the probability of behavior, and then apply it to incoming claims. In general, prepayment predictive modeling focuses primarily on specific transactions and office visits; post-payment predictive modeling tends to examine patient and provider behavior.
  • Risk scoring: Like a consumer credit score, it is used to assess providers based on a predictive model that analyzes billing, claims, and other relevant public and private data. A risk scoring model might evaluate providers based on 15 variables and develop scores between 0 and 1,000, with the riskiest providers scoring over 900. The overall risk score is a single metric that provides an at-a-glance evaluation and allows providers to be categorized according to risk level, but an analyst can dig deeper and look at individual scores for each variable.
  • Link analysis: It is a predictive modeling approach that identifies unusual or hidden relationships and helps expose fraudulent providers. Link analysis can examine the various components of false identities, including provider names, aliases, Social Security numbers, locations, addresses, and phone numbers, to find links between providers, suppliers, employees, and beneficiaries. Using algorithms to evaluate these connections, link analysis generates visualizations that map this network, reveal relationships among providers and known criminals, and simplify the process of data interpretation.
  • Trend analysis: It is an analysis of behavior or activity over time to identify trends and project future direction. For example, if abnormal provider activity raises a red flag, trend analysis can explore historical behavior. Or it might be used to analyze a less specific query for random or geographic audits and to predict future trends or behavior. Trend analysis can help states evaluate Medicaid policy, which is especially important as the Medicaid population grows and changes under the ACA.
  • Spike analysis: It recognizes more obvious behavioral changes by illustrating them as “spikes” that stand out from normal behavior. Spikes may not be out of the ordinary depending on location, time of year, or other developments, and they can be used to predict future spikes. On the other hand, a provider’s billing behavior might look flat or “normal” when viewed historically. But if his or her identity and credentials are stolen and used illegally, a spike in the number of claims, patients, or services will be obvious to data analysts.
  • Cluster analysis: It is a type of predictive modeling in which common data objects are automatically grouped into discrete clusters based on predetermined parameters. Clusters of providers can be compared to determine normal billing behavior and indicate trends. Multiple algorithms can be used to create clusters depending on the type of model needed. Analysts may experiment with many different clustering algorithms to find the model that best answers a particular question.
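
To make the risk scoring idea from the list above concrete, here is a minimal sketch that combines per-variable scores into a single number between 0 and 1,000. The weighted-average formula, the variable names, and the function names are assumptions for illustration. Only the 0 to 1,000 range and the 900 cutoff come from the description above.

```python
def risk_score(variable_scores, weights=None):
    """Combine per-variable scores (each between 0 and 1) into a 0-1,000 score.

    variable_scores: {variable_name: score}. weights default to equal weighting.
    """
    if weights is None:
        weights = {name: 1.0 for name in variable_scores}
    total_weight = sum(weights[name] for name in variable_scores)
    weighted = sum(variable_scores[name] * weights[name] for name in variable_scores)
    return round(1000 * weighted / total_weight)


def risk_tier(score, high_cutoff=900):
    """Categorize a provider; the riskiest providers score over the cutoff."""
    return "high" if score > high_cutoff else "standard"


# Hypothetical provider with three of the model's variables shown.
scores = {"billing_volume": 1.0, "denied_claims": 0.9, "peer_deviation": 0.95}
overall = risk_score(scores)
print(overall, risk_tier(overall))
```

Keeping the per-variable scores alongside the overall number is what lets an analyst drill down from the at-a-glance score to the variables that drove it.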


Workflow of Big Data and Predictive Analysis Processing:


Predictive analytics uses techniques such as machine learning and data mining to predict what might happen next. It can never predict the future with certainty, but it can look at existing data and determine a likely outcome. Once they have enough data, data analysts can build predictive models to generate predicted outcomes. There are two types of predictive models: classification models and regression models. Classification models predict class membership, and regression models predict a number, for example how much revenue a customer will generate over the next year or the number of months before a component will fail on a machine. Three of the most widely used predictive modeling techniques are decision trees, regression, and neural networks.

Regression (linear and logistic) is one of the most popular methods in statistics. Regression analysis estimates relationships among variables. It finds key patterns in large data sets and is often used to determine how much specific factors, such as the price, influence the movement of an asset.
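
As a small worked example, simple linear regression with one input can be fitted with the ordinary least squares formulas. The sketch below is plain Python with made-up numbers. The `fit_line` name and the data are assumptions, and real projects would normally use a statistics library instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: the input factor and the value it is thought to influence.
factor = [1, 2, 3, 4]
outcome = [3, 5, 7, 9]
slope, intercept = fit_line(factor, outcome)
print(f"each unit of the factor moves the outcome by about {slope:.2f}")
```

The slope is the estimate of how much the outcome changes for each unit change in the factor, which is the "how much does this factor influence that" question regression is usually asked.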

Decision trees are classification models that partition data into subsets based on categories of input variables. This helps you understand someone's path of decisions. A decision tree looks like a tree with each branch representing a choice between a number of alternatives, and each leaf representing a classification or decision. This model looks at the data and tries to find the one variable that splits the data into logical groups that are the most different.
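
The search for "the one variable that splits the data" can be sketched as follows. This version scores each candidate variable by the weighted Gini impurity of the groups it produces, which is one common criterion among several. The function names and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini impurity: 0 when all labels agree, higher when they are mixed."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, label_key):
    """rows: list of dicts. Returns (feature, impurity) for the feature whose
    categories separate the labels most cleanly (lowest weighted impurity)."""
    features = [k for k in rows[0] if k != label_key]
    best_feature, best_impurity = None, float("inf")
    for feature in features:
        # Partition the labels by the category each row has for this feature.
        partitions = defaultdict(list)
        for row in rows:
            partitions[row[feature]].append(row[label_key])
        impurity = sum(len(p) / len(rows) * gini(p) for p in partitions.values())
        if impurity < best_impurity:
            best_feature, best_impurity = feature, impurity
    return best_feature, best_impurity
```

A full decision tree repeats this step on each resulting subset until the groups are pure enough or a depth limit is reached.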

Neural networks are sophisticated techniques capable of modeling extremely complex relationships. They handle nonlinear relationships in data, which is increasingly common as we collect more data. Neural networks are based on pattern recognition and some AI processes that graphically “model” parameters. They work well when no mathematical formula is known that relates inputs to outputs, when prediction is more important than explanation, or when there is a lot of training data.
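
A toy example helps show what "nonlinear" means here. The network below has two hidden units and one output unit, and it computes "one input or the other, but not both", a pattern that no single straight-line rule can separate. The weights are hand-picked for illustration, not learned. A real network would learn them from training data.

```python
def step(x):
    """Threshold activation: the unit fires (1) only for positive input."""
    return 1 if x > 0 else 0


def tiny_network(a, b):
    """Two-layer network with fixed, hand-chosen weights (an assumption)."""
    h_or = step(a + b - 0.5)    # hidden unit: fires if at least one input is 1
    h_and = step(a + b - 1.5)   # hidden unit: fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # output: "or" but not "and"


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", tiny_network(a, b))
```

The hidden layer is what makes this possible: each hidden unit draws its own simple boundary, and the output unit combines them into a shape no single boundary could express.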

Current Vendor landscape for Advanced Analytics:

Organizations across the globe are using advanced analytics and data science to predict and make decisions. Advanced and predictive analytics are being applied in a range of areas including fraud detection, security, safety, healthcare, and disaster response. IBM is one of the most competent vendors offering solutions for Big data and predictive analytics. Their Big data product suite includes InfoSphere Streams, InfoSphere BigInsights, IBM Watson Explorer, IBM PureData powered by Netezza technology, DB2 with BLU Acceleration, IBM Smart Analytics System, InfoSphere Information Server and InfoSphere Master Data Management. The Predictive Analytics software portfolio of IBM includes: IBM SPSS Analytic Server, IBM SPSS Statistics, SPSS Predictive Analytics Enterprise, IBM Analytical Decision Management, IBM SPSS Data Collection and IBM Social Media Analytics.


Insurance Industry Landscape for Advanced Analytics:

Until now, the insurance industry has been slower than many others in adopting new technologies. That is set to change, with many insurers planning to make more use of data analytics. Most P&C insurers (92% according to a recent survey in the US) have planned initiatives around Big Data and advanced analytics. However, the existence of data silos means that many insurers are only at the early stages of building out the foundations for analytics initiatives, as they are still ironing out legacy system challenges. Although industry IT spending has remained constant over the last few years (around 4% of premiums), analysts expect a realignment within static budgets as many insurers complete core systems updates and allocate more funds to newer initiatives like digital and analytics. The extent of investment will likely vary. Back in 2016, data and analytics leaders at global insurers said they were investing as much as USD 80 million in data analytics each year, and most said they planned to increase spending. IDC forecasts spending on Big Data and analytics solutions across all industries to grow at a CAGR of 13.2% over 2018‒2022, and we encourage insurers to keep pace. Larger insurers with global footprints spend more. For example, in 2015 Generali said it would reinvest EUR 1.25 billion (USD 1.42 billion) in technology and data analytics through 2018. However, insurers are less likely to invest in very large-scale projects since managing and harvesting benefits can be difficult. Most insurers have a range of carefully prioritized projects, and often start with narrow use cases that can be operationalized quickly so that value add is easier to demonstrate. For instance, QBE reports that its analytics teams managed to complete over 100 projects in 2018 and that its main focus remains on applying associated learnings to underwriting and claims.
Estimates suggest that in the US, data and analytics projects will account for around 15% of P&C insurers’ IT spending in 2019 (see Figure 2). It is hard to estimate a figure for global spending on data and analytics alone due to differences between markets. Gartner forecasts global insurer IT spending to reach USD 220 billion in 2019 (both P&C and L&H), and we conservatively estimate that 8‒10% of that (USD 18‒22 billion) will be annual outlay on data and analytics. This accounts for around 3% of the insurance industry’s expense base (expense ratio assumed to be 15% of global premiums of USD 5.3 trillion in 2019).
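
The arithmetic behind this estimate can be reproduced directly from the figures quoted above. The short sketch below does so; the function names are hypothetical and all inputs are the article's own numbers.

```python
def analytics_outlay(it_spend_bn, low_share, high_share):
    """Range of annual data and analytics spending, in USD billions."""
    return it_spend_bn * low_share, it_spend_bn * high_share


def share_of_expense_base(outlay_bn, premiums_bn, expense_ratio):
    """Outlay as a fraction of the industry's expense base."""
    return outlay_bn / (premiums_bn * expense_ratio)


# USD 220 bn IT spending, 8-10% on analytics, USD 5.3 tn premiums, 15% expense ratio.
low, high = analytics_outlay(220, 0.08, 0.10)
for outlay in (low, high):
    share = share_of_expense_base(outlay, premiums_bn=5300, expense_ratio=0.15)
    print(f"USD {outlay:.1f} bn -> {share:.1%} of expense base")
```

The low end comes out at USD 17.6 billion, which the text rounds to 18. Against an expense base of USD 795 billion, the range is roughly 2.2% to 2.8%, which is the basis for the "around 3%" figure.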




Additional potential use cases in the Insurance Industry

  1. Underwriting and Pricing Risk Assessment: Since the underlying nature of business in the insurance industry involves risk, advanced analytics is used to conduct a real-time risk analysis that enables organizations to be quick on their feet in a volatile risk environment. Analytics has experienced the highest penetration in underwriting and pricing, particularly in policy technical pricing. However, most such models lie within the realm of actuarial science rather than data science. The use of more advanced, machine learning-based tools remains low but could deliver huge value. Behavioral pricing, for example, builds on technical pricing to enable the insurer to optimize price customization for each customer. Still, few Iberian companies have taken steps toward developing this capability—even among direct insurers, for whom such capabilities are more intuitive given their digital backbone. In attempting to explain why, some interviewees noted that actuaries are reluctant to leave pricing decisions to data scientists, who may have less experience in this highly specialized domain.
  2. Personalizing Marketing Strategies and Targeting Specific Customer Groups: Personalization is not a new concept in the insurance industry. Customers want services that best suit their needs and lifestyle, and they look for personalized offers, policies, loyalty programs, and recommendations. In the era of extensive digital communication, insurance companies face the challenge of engaging their customers and communicating with them effectively. Advanced analytics is used to extract insights from an expansive database that holds customer details such as demographic data, preferences, attitudes, lifestyle details, interests, and belief systems, among many others. This helps insurance companies deliver highly personalized and appropriate experiences. Hypotheses and models for personalization and marketing strategies are formulated using data acquired from various digital platforms and are then tailored to fit each customer's needs. Several analytical tools and mechanisms help companies achieve this outcome. Personalizing offers, policies, prices, recommendations, and marketing ads contributes to success in acquiring customers and, in turn, increases the insurance rates of a company.
  3. Influencing Customer Behavior: Advanced analytics has also been employed by insurance companies to research telematics data and influence customer behavior. For instance, health insurance companies can capture data generated from IoT devices and wearable technology such as fitness trackers and analyze it to track variables that determine the health of a person and assess risk. By monitoring behavior and habits, insurance companies can provide a comprehensive assessment of their customers’ health and urge customers to take better care of their health, thereby mitigating the risks involved. Insurance companies can go further and offer services and discounts that motivate customers to use fitness monitoring devices. John Hancock Financial, a renowned life insurance company, offers its customers discounts on their premiums and a free Fitbit wearable monitor, so that customers can work to reduce their premiums by showing how they are progressing in reducing unhealthy and risky behaviors.
  4. Lifetime Value Prediction: Customer Lifetime Value (CLV) is predicted using customer behavior data to determine the customer’s profitability for the company. Behavior-based predictive models are used to process all the data on customers and arrive at a forecast of customer buying and retention. These models provide insights into how likely customers are to maintain or surrender a policy. CLV can also be leveraged for developing market strategies, as it reflects one of the important customer characteristics.
  5. Claims Prediction: Predicting the turn of events in the future is of paramount interest to the insurance industry. Being able to form accurate claim predictions helps to mitigate risks, gain competitive advantage, and reduce financial losses.
    Advanced analytics propels some of the most complex processes involved in building financial models that have a large number of variables affecting the outcome. Algorithms are developed to recognize relationships between vast numbers of variables and detect several important parameters that are essential to building a customer portfolio.
    Forecasting the potential claims helps insurance companies to develop competitive and optimum premiums and improve pricing models.
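
To illustrate how a claims forecast can feed into pricing (item 5 above), the sketch below uses the textbook frequency-severity decomposition: expected claim cost per policy is the expected number of claims times the expected cost per claim, and the premium is that amount grossed up for expenses and margin. This is a simplified actuarial convention used here as an assumption, not a method described in this article, and all numbers are hypothetical.

```python
def pure_premium(expected_claims_per_policy, expected_cost_per_claim):
    """Expected claim cost per policy: frequency times severity."""
    return expected_claims_per_policy * expected_cost_per_claim


def gross_premium(pure, expense_ratio, profit_margin=0.0):
    """Premium such that expenses and margin, taken as shares of the
    premium itself, are covered on top of expected claims."""
    loading = expense_ratio + profit_margin
    if not 0 <= loading < 1:
        raise ValueError("expense_ratio + profit_margin must be in [0, 1)")
    return pure / (1 - loading)


# Hypothetical portfolio: 5% of policies claim per year, USD 8,000 per claim.
expected_cost = pure_premium(0.05, 8000)
print(f"premium: {gross_premium(expected_cost, expense_ratio=0.15, profit_margin=0.05):.2f}")
```

In this framing, better claims prediction improves the two inputs, frequency and severity, and the pricing model turns them into a premium.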

Advanced Analytics Implementation Challenges and Strategies in the Insurance Industry

The insurance industry faces a variety of unique challenges due to regulation, individually operating lines of business, varying depths of data sources, and more. Modern digital-age competitors are challenging the business models of incumbents, forcing them to embrace digital transformation or become extinct. One of the major challenges is putting in place a broader data strategy that makes more business-ready information available to analysts and business teams. A 2016 study by Willis Towers Watson surveyed insurance industry executives about their use of big data. One question probed the challenges and barriers the executives see. Data availability ranked second, behind people skills.

Implementation Strategy:

Data richness and quality. Most insurers are investing in data lake technology, but few store data from across domains and functions. Even fewer have reached a level of maturity where the data lake is fully integrated with operations. In practice, this means that even if a model can be developed using all the available data, it is almost impossible to use the model unless the data is continuously and automatically updated. One interviewee mentioned, “We have a data lake, but it still lacks synchronization with all transactional data sources. We can use it to develop models but only based on information several months old.” Some insurers are exploring or have even developed partnerships with data aggregators to acquire external information on customers and prospects, but to date, external data use is rare.

Modern modeling tools and techniques. All the insurers we interviewed have ramped up their use of modern tools (such as Python, R, or Scala) and machine-learning techniques (such as random forest, gradient boosting, or deep learning). Again, the extent and scope of their applications vary considerably. In most cases, these capabilities are confined to specific domains or teams, though some have extended them more broadly across the organization. Some insurers have even started using platforms that automate the tasks of a data scientist, such as DataRobot.

Analytics talent. Although our interviewees employ in-house AA experts, the majority also recognize they do not have enough. In-house experts may lack the appropriate knowledge and authority necessary to effect change; as such, they struggle to meet the target pace of use-case implementation. With notably competitive recruitment of such talent, insurers must invest considerable resources in finding, securing, and retaining such talent. In addition, while insurers are working to incorporate data scientists into their ranks, they do not always focus on crucial complementary roles such as data engineers, data architects, workflow integrators, and business translators.

Effective business integration. Some AA models have been integrated effectively into the business, but there is still a long way to go, particularly regarding decision-management processes and staff training. While the connection between an insight-generating model and the decision-making process should be automatic, full integration is rare. For example, a model that aims to boost customer retention rates has little value if the list of prioritized customers is not properly integrated into a call center’s workflow. Bringing end users up to date with the new process often poses another roadblock. As the chief analytics officer of a leading health insurer observed, “Sometimes we prefer to defer model development, even at the cost of having our budget reduced, just because we do not want to push for something that neither the end-users nor the technology is ready to digest.” Another interviewee commented, “The analytics teams believe their job ends once the model is developed, but no one is really taking care end to end to make the models work.” And a third said, “The challenge is not resistance to change, it’s lack of integration of the model into the workflow.”


Analytics is an increasingly important lever for being proactive about FWA. Traditional outlier detection algorithms provide a front-line defense; however, hidden relationships and correlations can still go undetected. Advanced analytics can provide a robust way to reveal hidden relationships among sub-cohorts and proactively reduce waste. The insights and reporting offered by healthcare analytics could, for example, help managed care executives address persistent volatility in cost for both medical and pharmacy claims, with a lens on population sub-cohorts, to positively impact outcomes. Predictive analytics is a powerful way to make decisions with confidence. By leveraging existing claims data, it is possible to find creative options to lower costs while improving population health. Advanced analytics and emerging technologies provide this option.

A customized, well-implemented predictive analytics solution should both address an immediate problem and make sure it doesn't return in the future. Healthcare executives should be able to recognize the positive change that can arise from taking the right approach to predictive analytics, and look for areas where it could potentially improve their operations, from reducing pharmacy costs to improving patient health outcomes.