Understanding the Four Data Types for AI-Based Personalization in Customer Experience Management
Why do different companies value the same types of data so differently?
Introduction
Over the past five years or so (ever since the Covid pandemic), most marketers have gotten on board the AI-based personalization train. As they have done so, a fundamental tension has arisen in Fortune 500 boardrooms and in the strategy sessions of customer-focused AI startups alike. On the one hand, marketing executives at CPG companies like PepsiCo and Procter & Gamble have committed to “mass one-to-one marketing” powered by sophisticated AI systems. On the other hand, they must contend with a growing body of privacy regulations, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), that restrict the data available to them (Adams, 2024; Barnett, 2022; Michu, 2024; Ramaswamy, 2016; Snowflake, 2025).
We are at an interesting and potentially game-changing inflection point. CX executives at companies like Adobe, Home Depot, and Starbucks are pushing for hyper-personalized customer journeys at the same time that their legal departments are raising red flags about data collection practices and their customers are pushing back against intrusive overreach and demanding transparency (Branscum, 2024; Dholakia, 2023; Korganbekova & Zuber, 2023). This so-called privacy-personalization paradox, in which effective personalization requires ever more data while data collection faces ever greater constraints, now spans every industry.
Behind closed doors at many companies, Chief Data Officers and Chief Marketing Officers are engaging in increasingly passionate debates about which types of data should form the foundation of their CX personalization strategies. Some, especially those who have the ability to engage in regular conversations with their customers, argue for an all-in approach to zero-party data, which includes preferences and evaluations provided by customers. Others, like Netflix, favor sophisticated first-party data collection infrastructures that passively observe every (mostly digital) customer action at scale. Still others advocate for strategic data partnerships or going with legacy third-party data vendors to fill crucial gaps in their customer insights. The reality is that an effective CX personalization strategy is highly context-dependent: a data approach that works for one brand or company won’t work for another.
No matter your position, the specific argument about which data types are valuable and why goes to the heart of a core question in AI-based personalization: Why do different organizations value the same types of data so differently? In recent years, with the increasing attention given to CX personalization, numerous academics and practitioners have grappled with this question. Their debates shine light on the fragmentation of data strategies as organizations navigate an increasingly complex and rapidly evolving landscape of customer expectations, privacy regulations, and technological capabilities (Davenport & Bean, 2018).
The effectiveness of AI-based CX personalization is directly associated with the quality, source, and nature of consent underlying the data that is used. Thus, understanding the distinctions between zero-party, first-party, second-party, and third-party data is crucial for developing ethical, effective, and compliant CX personalization strategies and delivering useful, relevant, and memorable experiences to customers. Each type of data offers distinct advantages and limitations, shaping how AI-based personalization initiatives interpret customer needs and respond to them. Zero-party data captures stated customer intent, first-party data reveals behavioral patterns of customers, second-party data extends the reach of customer understanding through data partnerships with other organizations, and third-party data broadens the scope of personalization by providing aggregated insights.
This article explores the four categories of customer data in detail, examining their roles in AI-based personalization through numerous case studies of specific brands and companies drawn from a variety of industries. By understanding the applications and constraints of these data, organizations can craft strategies that enhance customer engagement while navigating the ethical and operational intricacies of data-driven decision-making1.
Zero-party data
Forrester Research coined the term zero-party data back in 2018, defining it as “data that a customer intentionally and proactively shares with a brand” (Forrester, 2021). While the term is new, the idea behind it is as old as marketing research itself. Marketing researchers have collected this sort of data in different forms and for different reasons for many decades through surveys. Zero-party data has important ramifications for AI-based personalization. Unlike the other three data types that rely on observation, inference, or imputation, zero-party data is derived from direct, purposeful communication between the customer and the brand, creating a foundation for personalization strategies directly from “the mouths of customers.” This provenance is particularly important for brands that rely on personalization as the basis of their value proposition, as the following examples illustrate.
Sephora. Consider the case of personal care and beauty product retailer Sephora, whose “Beauty Insider” quiz asks customers detailed questions about their top skincare concerns (e.g., acne and blemishes), product preferences (e.g., moisturizers, masks), skin type (e.g., dry) and beauty goals, and interactively generates a detailed beauty profile at the outset of the relationship. Mini-quizzes and surveys supplement this knowledge base from time to time to gather more information from the customer about their needs, preferences, and evaluations. Over time, the deliberate, orchestrated collection of customer-provided information has resulted in a rich dataset for Sephora that, in turn, has powered its recommendation engine with explicitly stated preferences rather than inferred ones. When Sephora’s AI suggests products to its customers, it does so based on what they have volunteered about themselves and their preferences, creating a level of personalization that is seen as relevant and respectful (Lindecrantz et al., 2022).
Stitch Fix. Along similar lines, the digital fashion service Stitch Fix, which delivers personalized clothing and accessories to its customers periodically on a subscription basis, has built its entire business model around zero-party data. New customers complete an extensive style quiz covering size, fit preferences, price sensitivity, style inclinations, and even how adventurous they want their selections to be. This information becomes the basis for Stitch Fix’s hybrid algorithm (which combines AI output with the judgments of human stylists) in the selections it presents to customers. The company reports that customers who provide more detailed zero-party data through the quiz and subsequent feedback have significantly higher satisfaction rates and lower return frequencies (Lake, 2018; Stitch Fix, 2024).
Furthermore, the value of zero-party data extends beyond immediate or short-term personalization campaigns. For instance, when outdoor retailer REI asks members about their favorite activities and adventure aspirations, it is not just collecting data to deliver “next product” recommendations or personalize landing pages (both of which it does). It is gathering customer insights to inform overall content creation, community events, and even product development over the coming quarters and years, with a distinctly strategic bent. In this case, the organization’s zero-party data becomes a strategic asset that informs multiple touchpoints across customer journeys over a lengthy period. Interestingly, in the 1990s and 2000s, many customer-focused organizations, such as well-run retail banks, had a similar idea. They comprehensively and frequently surveyed their customers, using that information to create offers their customers said they really wanted. They just didn’t have today’s technology for delivering truly personalized messages or offers.
Importantly, zero-party data has at least three inherent limitations. The first limitation arises from the fundamental psychological gap between what people say and what they do. Zero-party data relies strictly on the former, often leading to misleading signals even for tactical decision making. As one example, meal-kit delivery service HelloFresh discovered that while its taste preference surveys yielded what seemed to be valuable insights, they lacked accuracy. Many subscribers routinely provided aspirational preferences instead of what they actually liked. For instance, some customers would indicate a desire for adventurous, exotic meals when filling out the initial survey, but their actual ordering patterns and recipe ratings revealed a preference for familiar, relatively boring comfort foods (Higley & Snyder, 2024; Yoon & Meyvis, 2024). This gap between customers’ stated preferences and their actual behavior highlights perhaps the most significant problem with relying too heavily on zero-party data, to the exclusion of the other types of data, as developed later in this article2.
A second, potentially serious limitation involves the burdens associated with collecting customer preference and opinion data, especially given the current mounting challenges with survey-based research3. This difficulty has led many companies to shift their focus away from zero-party data. For instance, over the years, Netflix has experimented extensively with preference questionnaires for its customers, even offering significant incentives to participate in proprietary panel-based research4. However, it found that completion rates plummeted when surveys extended beyond a certain length, and many responses were aspirational rather than realistic. Netflix ultimately shifted to a hybrid approach, combining a relatively small amount of zero-party data collection (rating a movie or show, clicking the thumbs up icon, etc.) with more extensive behavioral analysis (i.e., first-party data). The company also moved to define subscriber engagement using measures of actual viewing activity (Netflix, 2023), essentially finding that what subscribers watch provides more reliable guidance than what they say they want to watch or like to watch (Gomez-Uribe & Hunt, 2015).
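To make the idea of a hybrid approach concrete, the following is a minimal sketch of one way to blend a stated preference with observed behavior. It is purely illustrative and is not Netflix’s method: the `GenreSignal` fields, the `blended_score` function, and the `half_weight_at` parameter are all hypothetical. The assumption built into it is that the weight on observed behavior should grow as more behavioral evidence accumulates.

```python
from dataclasses import dataclass


@dataclass
class GenreSignal:
    stated: float      # zero-party: self-reported interest, 0.0 to 1.0
    observed: float    # first-party: share of viewing time, 0.0 to 1.0
    interactions: int  # number of observed sessions behind `observed`


def blended_score(signal: GenreSignal, half_weight_at: int = 20) -> float:
    """Blend stated and observed preference for one genre.

    With 0 interactions the score equals the stated preference.
    At `half_weight_at` interactions both sources get equal weight.
    With many interactions the score is dominated by observed behavior.
    """
    behavior_weight = signal.interactions / (signal.interactions + half_weight_at)
    return (1 - behavior_weight) * signal.stated + behavior_weight * signal.observed


# A new subscriber who says they love documentaries but has no viewing history.
new_user = GenreSignal(stated=0.9, observed=0.0, interactions=0)
# A long-time subscriber who said the same but rarely watches them.
veteran = GenreSignal(stated=0.9, observed=0.1, interactions=180)

print(round(blended_score(new_user), 2))
print(round(blended_score(veteran), 2))
```

For the new subscriber, the score simply mirrors the stated preference; for the long-time subscriber, the aspirational answer is largely overridden by what they actually watch.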
Third, zero-party data has the inherent constraint of limited scale and completeness: it relies on customers’ voluntary provision of information, significantly restricting the volume or comprehensiveness of data collected when compared to passive data-gathering techniques. When a company distributes a detailed preference survey to its customers, the response rates are typically modest. As an example, even well-designed, well-compensated surveys rarely achieve response rates above 25% (usually far less), indicating that information about the vast majority of customers remains unknown (e.g., Dillman et al., 2014). Similarly, website preference centers, designed to capture user interests, see engagement from a small fraction of customers (e.g., less than 15% in one EY-Parthenon study), leaving the majority of preferences unknown.
Another effective way to collect zero-party data is through customer loyalty programs, which yield valuable data from enrolled participants but miss a significant share of the total customer base and of sales transactions. This is true even for the most successful loyalty programs. For example, the Starbucks rewards program, considered by many to be the gold standard, is responsible for under 60% of its US sales (Soper, 2024), meaning that more than 40% of its sales go to customers who have provided no zero-party data and are therefore invisible to Starbucks. Interactive tools and quizzes have an analogous limitation; they engage a small proportion of self-selecting participants and exclude the vast majority who are not interested. Consequently, while ethically above board, the voluntary nature of zero-party data inherently constrains its scope relative to the other types of data, which harvest information from broader customer populations with more nuanced consent arrangements.
To summarize, zero-party data offers the clearest and most self-reliant, if not the most effective, means to implement AI-based personalization strategies in customer experience management. When a customer explicitly states a preference and then receives recommendations based on that preference, the link between data and company action is transparent and builds trust. What’s more, the brand can explain the logic behind the recommendation. Such explainability is increasingly valuable as AI systems face growing scrutiny from regulators and consumers.
First-party data
If zero-party data represents what customers tell the brand about themselves, first-party data reveals to the brand what they actually do. These behavioral variables, collected directly through the company’s owned channels and customer interactions at its touchpoints, form the basis of many sophisticated AI-based personalization efforts (Davenport & Bean, 2018). Amazon’s recommendation engine provides perhaps the most widely experienced (at least among US consumers) example of first-party data powering AI-based personalization at scale, with Netflix a close second. The retailer’s algorithms analyze browsing patterns, purchase history, cart abandonment, review behavior, and countless other interactions to create increasingly refined customer profiles. These profiles then power the “Customers who bought this also bought...” and “Recommended for you” features that drive a significant portion of Amazon’s revenue (Smith & Linden, 2017).
The strength of first-party data harvesting lies in its self-reinforcing mechanism: as more first-party data is collected, the precision of recommendations and other personalization activities improves, leading to greater customer engagement, which, in turn, creates opportunities (e.g., through greater customer receptivity) to gather additional data for analysis (Schrage & Kiron, 2018). Furthermore, this virtuous cycle has been at least partly responsible for the so-called “Amazon Effect,” which has raised overall consumer expectations and increased the challenges associated with delivering superior customer experiences not just for other retailers but for every customer-focused organization regardless of industry (Vollero, Sardanelli, & Siano, 2023).
Looking beyond retailing, the financial services sector also provides evidence of the transformative potential of first-party data. Capital One leverages data from spending patterns, payment histories, and interactions within its digital banking platforms to deliver personalized financial insights. As one example, its “Second Look” feature is a spending alert tool that uses AI to identify spending anomalies such as unusual expenditures or potential duplicate charges and send an alert to customers in a timely fashion for verification (Poole, 2017). This service provides significant value to customers by enhancing the financial oversight of their accounts with Capital One and enriching the bank’s understanding of typical behavioral trends, strengthening its personalization capabilities.
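The two kinds of alerts described above (potential duplicate charges and unusual expenditures) can be sketched with very simple rules. The code below is an illustration under stated assumptions, not Capital One’s implementation: the `Charge` record, the one-day duplicate window, and the z-score threshold of 3 are all hypothetical choices, and a production system would use far richer models.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class Charge:
    merchant: str
    amount_cents: int  # integer cents avoids float equality problems
    day: int           # days since an arbitrary reference date


def possible_duplicates(charges, max_gap_days=1):
    """Pairs of charges with the same merchant and amount, close in time."""
    ordered = sorted(charges, key=lambda c: (c.merchant, c.amount_cents, c.day))
    pairs = []
    for prev, curr in zip(ordered, ordered[1:]):
        same_charge = (prev.merchant, prev.amount_cents) == (curr.merchant, curr.amount_cents)
        if same_charge and curr.day - prev.day <= max_gap_days:
            pairs.append((prev, curr))
    return pairs


def unusual_charges(history, new_charges, z_threshold=3.0):
    """New charges far above this customer's own historical spending."""
    amounts = [c.amount_cents for c in history]
    if len(amounts) < 2:
        return []  # not enough history to estimate the spread
    mu, sigma = mean(amounts), stdev(amounts)
    if sigma == 0:
        return [c for c in new_charges if c.amount_cents > mu]
    return [c for c in new_charges if (c.amount_cents - mu) / sigma > z_threshold]


# Hypothetical history of routine grocery purchases (amounts in cents).
history = [Charge("GroceryCo", a, d) for d, a in enumerate([2000, 2500, 2200, 3000, 1800, 2700])]
new = [
    Charge("CoffeeCo", 450, 10),
    Charge("CoffeeCo", 450, 10),       # same merchant, amount, and day
    Charge("ElectronicsCo", 31000, 11),
]

duplicate_pairs = possible_duplicates(new)
outliers = unusual_charges(history, new)
print(duplicate_pairs)
print(outliers)
```

Note that the baseline here is the customer’s own history rather than a population average, which is what makes the alert a form of personalization rather than a generic fraud rule.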
In the healthcare industry, too, first-party data can drive effective CX applications. As one example, Livongo, now integrated into Teladoc, collects glucose readings from diabetic patients via connected devices, constructing individualized health profiles that facilitate timely interventions and tailored recommendations. Studies have demonstrated that this data-driven approach yields measurable improvements in patient outcomes while reducing healthcare costs (e.g., Amante et al., 2021), highlighting the utility of first-party data beyond traditional CX contexts. Similarly, wearable technologies such as Fitbit and Apple Watch use first-party data such as step counts, sleep patterns, heart rate metrics, and blood oxygen levels to send warnings and notifications and to offer personalized fitness guidance.
Finally, the hospitality industry has also recognized the value of first-party data. Realizing its strategic significance, Marriott has revamped its Bonvoy app to collect more meaningful first-party data beyond bookings. The enhanced app has a larger scope, capturing guest preferences around amenities, service interactions, and on-property behaviors during every stay at a Marriott property and gradually creating more holistic customer profiles with the ability to power more sophisticated personalization over the guest journey.
Bridging the intentions-behavior gap. A key strength of first-party data lies in its objectivity and reliability. As discussed earlier, whereas zero-party data often captures aspirational and inaccurate stated preferences (e.g., I want to eat healthier), first-party data captures actual behaviors (e.g., purchasing a one-gallon bucket of ice cream at midnight). This gap between stated intentions and actual behaviors makes first-party data particularly valuable for AI-based personalization applications that aim to predict customers’ future behaviors rather than respond to their stated preferences. For instance, research by Nielsen shows this discrepancy can be substantial: whereas 70% of consumers claim to read nutritional information, eye-tracking studies reveal that only 9% examine this data when they are shopping (Nielsen, 2019). Similarly, Spotify discovered that users who manually created workout playlists often selected very different songs from those they listened to during exercise, as measured by their mobile accelerometer data (Bloudoff-Indelicato, 2019). Understanding these types of behavioral contradictions permits more accurate personalization than relying solely on customers’ self-reported preferences.
Identifying Temporal Patterns. Beyond individual behaviors, first-party data reveals invaluable and often hidden temporal patterns in customer behavior that are inaccessible in any other way. The sequence and timing of customer actions or interactions often contain more predictive power than the actions or interactions themselves. Netflix, for instance, found that viewing patterns provide critical signals that allow more impactful personalization. Whether a subscriber binge-watches a series in one weekend or spreads it over months reveals important preference information that influences recommendations (Amatriain & Basilico, 2016).
Similarly, financial services companies like American Express utilize temporal transaction data to identify life events even before customers explicitly acknowledge them. (Recall the famous early personalization case of Target knowing one of its customers was pregnant even before her family knew about it; Duhigg, 2012). Changes in spending patterns, such as increased purchases at home improvement stores, baby retailers, or wedding vendors, often signal major life transitions, creating opportunities for timely, relevant offers long before a customer might volunteer this information. This sort of “predictive personalization” based on behavioral patterns represents one of first-party data’s most powerful applications.
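As a rough illustration of how a shift in spending patterns might be detected, the sketch below compares each category’s share of spending in a recent window against a baseline window. It is a simplified, hypothetical example rather than any company’s actual method; the function names and the `min_share` and `min_lift` thresholds are assumptions chosen for illustration.

```python
from collections import Counter


def category_shares(transactions):
    """Map each category to its share of total spending.

    `transactions` is an iterable of (category, amount) pairs.
    """
    totals = Counter()
    for category, amount in transactions:
        totals[category] += amount
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {category: total / grand_total for category, total in totals.items()}


def emerging_categories(baseline, recent, min_share=0.15, min_lift=3.0):
    """Categories whose share of spending has jumped versus the baseline."""
    base = category_shares(baseline)
    now = category_shares(recent)
    flagged = []
    for category, share in now.items():
        # Floor the prior share so brand-new categories do not divide by zero.
        prior = max(base.get(category, 0.0), 0.01)
        if share >= min_share and share / prior >= min_lift:
            flagged.append(category)
    return sorted(flagged)


# Hypothetical monthly spending (category, dollars) for one customer.
baseline = [("groceries", 400), ("dining", 200), ("fuel", 100), ("home_improvement", 20)]
recent = [("groceries", 380), ("dining", 90), ("fuel", 100), ("home_improvement", 430)]

signals = emerging_categories(baseline, recent)
print(signals)
```

In this example, home improvement rises from under 3% to 43% of spending, which could indicate a move or a renovation. Whether and how to act on such an inference is a governance question as much as a technical one.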
The three organizational capabilities for effective first-party data deployment. Collecting first-party data requires substantial technical infrastructure and capabilities. The most sophisticated CX practitioners who rely on first-party data as the cornerstone of their personalization campaigns develop three critical capabilities: (1) unified identity resolution, (2) real-time processing, and (3) ethical governance frameworks. I will cover these ideas briefly here, acknowledging that they require further development in future notes.
First, identity resolution, defined as the ability to recognize the same customer across devices and sessions, forms the basis of effective first-party data strategies. For example, Target builds “guest profiles” that connect the customers’ in-store purchases, online browsing, mobile app usage, and loyalty program interactions into one consolidated, enriched customer view. This unified profile enables consistent personalization across channels (Taylor, 2020). Most other top-tier retailers are adopting a similar approach, building digital dossiers of their customers containing hundreds of behavioral data points and adding to them all the time (Abraham & Edelman, 2024). Some of this data also comes from second-party and third-party sources, as discussed later in this note.
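One common way to think about identity resolution is as a graph problem: identifiers that are observed together (an email and a device at login, a loyalty number and a payment token at checkout) are linked, and each connected group becomes one profile. The sketch below uses a union-find structure to do this. It is a minimal illustration, not Target’s system; the `IdentityGraph` class and the identifier formats are hypothetical, and real deployments also need probabilistic matching, conflict handling, and consent checks.

```python
from collections import defaultdict


class IdentityGraph:
    """Union-find over identifiers (emails, device IDs, loyalty numbers)."""

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        # Register unseen identifiers as their own root.
        self._parent.setdefault(identifier, identifier)
        while self._parent[identifier] != identifier:
            # Path halving keeps lookups fast as the graph grows.
            self._parent[identifier] = self._parent[self._parent[identifier]]
            identifier = self._parent[identifier]
        return identifier

    def link(self, a, b):
        """Record that identifiers `a` and `b` belong to the same customer."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def profiles(self):
        """Return one set of identifiers per resolved customer."""
        groups = defaultdict(set)
        for identifier in list(self._parent):
            groups[self._find(identifier)].add(identifier)
        return list(groups.values())


graph = IdentityGraph()
# Each pair was observed together in a single interaction (hypothetical data).
graph.link("email:ana@example.com", "device:phone-1")   # app login
graph.link("device:phone-1", "loyalty:1001")            # loyalty sign-in on app
graph.link("loyalty:1001", "card:tok-77")               # in-store purchase
graph.link("email:bo@example.com", "device:laptop-9")   # web login

resolved = sorted(graph.profiles(), key=len, reverse=True)
for profile in resolved:
    print(sorted(profile))
```

The first customer’s in-store card token is tied to her email even though the two were never seen in the same interaction; the link runs through the shared device and loyalty number.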
Second, real-time processing capabilities allow companies to act on behavioral signals instantaneously, turning personalization into something that seems like magic to customers, even as the underlying mechanism that generated the completely personalized connection remains opaque to them. For instance, the streaming service Disney+ has been analyzing the viewing behavior of its subscribers in real time to adjust its recommendation algorithm dynamically, even during the same viewing session. If a viewer begins watching children’s content in the morning and then switches to adult dramas in the evening, the interface adapts immediately rather than averaging these preferences over time, providing the appropriate recommendations based on when that customer turns on the TV to watch Disney+ (Whelan & Toonkel, 2024). There are many interesting and evolving case studies of the remarkable impacts of real-time processing of first-party data, and the influence and power of these initiatives in delivering new customer experiences and improving existing ones will only increase.
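The difference between adapting to context and averaging preferences over time can be shown with a small sketch. The code below keeps separate viewing counts per part of the day and falls back to the overall profile only when a time slot has no history. It is an illustrative assumption about how such context-awareness could work, not a description of Disney+’s system; the `daypart` boundaries and the `ContextualProfile` class are hypothetical.

```python
from collections import Counter, defaultdict


def daypart(hour: int) -> str:
    """Bucket an hour (0-23) into a coarse part of the day."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "evening"  # covers 18:00 through 04:59


class ContextualProfile:
    """Viewing counts kept separately for each part of the day."""

    def __init__(self):
        self._by_daypart = defaultdict(Counter)
        self._overall = Counter()

    def record_view(self, hour: int, genre: str) -> None:
        self._by_daypart[daypart(hour)][genre] += 1
        self._overall[genre] += 1

    def top_genre(self, hour: int):
        """Most-watched genre for this time of day, else the overall favorite."""
        counts = self._by_daypart.get(daypart(hour)) or self._overall
        return counts.most_common(1)[0][0] if counts else None


profile = ContextualProfile()
for hour in (7, 8, 9):
    profile.record_view(hour, "kids")
for hour in (20, 21):
    profile.record_view(hour, "drama")

print(profile.top_genre(8))    # morning context
print(profile.top_genre(21))   # evening context
print(profile.top_genre(14))   # no afternoon history, falls back to overall
```

A single averaged profile would recommend children’s content at all hours for this household, because it accounts for three of the five views. Splitting by context surfaces the evening preference for dramas.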
Third, ethical governance frameworks, often shaped by evolving laws (and by older laws reinterpreted to fit the current technological landscape), are critical for effective CX activities. These frameworks ensure that the use of first-party data respects both regulatory requirements, as they become more stringent, and customer expectations, as customers come to understand better what is at stake and the potential downsides to them. As one example of getting out in front of the problem, the digital payment platform PayPal has implemented “data ethics by design” principles that evaluate potential personalization applications against both compliance standards and customer trust impacts before deployment. More broadly, the company has proactively integrated ethical principles such as privacy, transparency, accountability, and fairness into the architecture of its data systems and processes from their inception rather than as an afterthought (Paypal, 2024). Such an approach helps prevent a privacy backlash from customers or other stakeholders, voiced, say, through social media, which could undermine the company’s personalization efforts by triggering regulatory action or souring its relationships with irate consumers.
Using first-party data in cross-channel applications. While digital interactions generate the most visible first-party data, leading organizations increasingly capture and apply behavioral data across physical “in-person” channels. As one example of this intriguing development, Sephora has been bridging online and offline behaviors through its so-called “digital mirror” technology over the past several years, recording product interactions in physical stores and connecting them to the online profiles of customers who have signed up for its Beauty Insider loyalty program. Such integration, while difficult to pull off, ensures that personalization activities are consistent regardless of whether the customer shops online or in a Sephora store (Wassel, 2019).
Another example is Chipotle, which extended its first-party data collection to its physical store locations via its mobile ordering app. By analyzing in-restaurant and mobile ordering patterns, Chipotle discovered that mobile users were more likely to experiment with new ingredients, while customers ordering in their restaurants were more likely to stick to familiar orders. This insight allowed the chain to develop different promotional strategies for each channel, highlighting new ingredients in mobile interfaces while emphasizing convenience and speed in restaurant environments.
The limited scope of first-party data. Like zero-party data, the primary limitation of first-party data lies in the narrowness of its scope. It only encompasses the interactions of a brand’s customers with the brand’s own touchpoints. That is all the information the brand can access about its customers, even when customer interactions at other places or venues are crucial for the successful completion of their journeys. For some companies like Netflix or Spotify, this is not a serious problem. Their customers’ activities on the respective platforms can be recorded meticulously and reflect self-contained and comprehensive journeys, beginning and ending on the platform. For other companies, however, this can be a major limitation.
Imagine a luxury brand like Hermes, or a category-specific retailer like Mattress Firm, whose customers purchase from them infrequently (perhaps once every decade in the case of Mattress Firm) and intersperse their within-category purchases with purchases from direct or indirect competitors (the Hermes customer may also buy from Louis Vuitton, Rolex, Chanel or other brands). In such cases, relying on first-party data may be insufficient because the typical core customer simply doesn’t interact enough at the company’s touchpoints to generate a sufficiently rich behavioral profile to run personalization campaigns.
Wayfair has faced precisely this challenge. Most of its customers purchase furniture from Wayfair infrequently, creating long gaps between meaningful first-party data collection opportunities. To address this problem, Wayfair developed an “inspiration browsing” feature, encouraging its customers to save rooms and styles they like, even when they are not actively shopping. While such an approach may generate useful first-party data (and even some zero-party data) during the relatively stable parts of long customer journeys, it is still subject to many of the concerns normally associated with zero-party data. To summarize, first-party data works best when customers’ journeys bring them to the company’s touchpoints frequently, and their behavior is rich, regular, and variegated.
As privacy regulations evolve and become tighter and as third-party cookies fade from mainstream use by the major media platforms, brands are increasingly acknowledging that first-party data may be their most sustainable and controllable competitive advantage. Every brand with the ability to do so has invested significant resources in expanding the scope and quality of its first-party data collection efforts and then in using this data for AI-based personalization.
Second-party data
Second-party data occupies a unique position in AI-based personalization. Second-party data is essentially another organization’s first-party data, acquired through a direct partnership with that organization. This deliberate approach to sharing data offers expanded insights while maintaining higher quality standards than are typically found in third-party data (Schultz & Block, 2015). The partnership between Delta Air Lines and Marriott International provides a great example of second-party data’s potential. When a Delta traveler books a flight to Chicago, Delta’s first-party data becomes available as second-party data to Marriott, allowing the hotel chain to deliver personalized hotel recommendations for specific dates to the traveler. Similarly, when Marriott knows a guest has an upcoming stay at one of its properties, that information is delivered as second-party data to Delta, allowing the airline to offer timely transportation options between the guest’s point of departure and the destination hotel. Because the two companies’ services are complementary, the partnership is a win-win, allowing both brands to expand their personalization capabilities without having to use less reliable third-party data sources (Assur & Rowshankish, 2020).
As another example, online media company Vox Media and financial services provider Northwestern Mutual entered a second-party data partnership focused on content consumption patterns. Northwestern Mutual gained insights into which financial topics generated the most engagement among different reader segments and was able to use technology, including AI and mobile apps, to engage younger, digitally savvy audiences like Gen Z, while Vox received data about NM customers’ financial interests and concerns which could be used to enhance the relevance of its content targeted to specific audiences via its advertising and segmentation frameworks. Thus, the second-party relationship allowed both companies in disparate spaces to collaborate, creating more targeted content strategies and more relevant customer experiences.
The Economics of Data Partnerships. Beyond the immediate, obvious tactical benefits, second-party data partnerships create significant economic advantages for both parties. Recent research suggests that organizations engaged in successful second-party data partnerships report a significant reduction in customer acquisition costs and an improved ROI on personalization efforts compared to those using third-party data approaches (Lim & Pinal, 2023). This cost advantage may stem from a variety of different sources, including higher data quality (when compared to third-party data), improved precision in targeting, and the elimination of intermediary fees typically associated with data brokers.
The economic model supporting these data partnerships has evolved significantly in recent years. A decade ago, second-party arrangements were typically structured as straightforward data trades or purchase agreements. In contrast, current partnerships often employ value-sharing models. According to recent reports from various consulting companies based on surveys of their clients, a majority of second-party data partnerships involve some form of revenue-sharing arrangement, while performance-based incentives are also included in a smaller but significant percentage of partnerships (e.g., Biegel et al., 2024). Note that the pricing structures used in these relationships are evolving from month to month, a topic I will explore further in another article.
Furthermore, it is worth noting that pricing structures that rely on win-win principles like revenue-sharing and rewarding performance are effective in the sense that they help overcome a common barrier to adoption: perceptions of asymmetry in the value exchange between partners. Specifically, when one party in the partnership believes it provides more valuable data than it receives in return, the partnership will naturally falter. By linking compensation to actual performance outcomes, this cause for conflict and the breakdown of the second-party partnership is bypassed, creating more balanced incentives for everyone involved and making for sustainable collaborations.
Consent Management Frameworks. The success of second-party data partnerships increasingly depends on robust consent management frameworks. In one study of industry practices, Forrester (2022) reported that a majority of US consumers are concerned about how their data is shared between companies, making transparent consent practices essential to maintain trust. Many CX-focused organizations address this challenge through customer-centric consent architectures. For instance, credit reporting agency Experian developed its consent framework specifically for second-party data partnerships, allowing consumers to grant or revoke permissions across multiple connected companies with a single action.
According to Experian, this convenience-forward approach increased consent rates from consumers significantly while reducing consent withdrawals (Experian, 2022). Similar innovations are also occurring in retail banking. As one example, Mastercard’s “Data Exchange” platform incorporates tiered consent options, allowing customers to authorize specific types of data sharing between financial institutions and merchants. These granular permissions covering transaction details, location data, and contact information provide consumers with greater control while still allowing partners to maintain compliant, effective data-sharing practices (Mastercard, 2025).
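To make the idea of tiered consent concrete, here is a minimal Python sketch of how granular, per-partner permissions might be represented and enforced. It is an illustration under stated assumptions, not a description of Experian's or Mastercard's actual systems: the class name, the function name, and the three category labels (borrowed from the examples in the text) are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical data categories, taken from the examples mentioned in the text.
CATEGORIES = {"transaction_details", "location_data", "contact_information"}


@dataclass
class ConsentRecord:
    """One customer's permissions, keyed by partner and then by data category."""
    customer_id: str
    grants: dict = field(default_factory=dict)  # partner name -> set of categories

    def grant(self, partner, category):
        """Authorize one category of data for one partner."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown data category: {category}")
        self.grants.setdefault(partner, set()).add(category)

    def revoke_all(self):
        """Withdraw consent across every connected partner in a single action."""
        self.grants.clear()

    def allows(self, partner, category):
        return category in self.grants.get(partner, set())


def shareable_fields(record, partner, payload):
    """Return only the payload fields the customer has authorized for this partner."""
    return {key: value for key, value in payload.items() if record.allows(partner, key)}
```

The design choice worth noting is that filtering happens at the moment of sharing, so a revoked permission takes effect on the very next exchange without either partner having to clean up its own records first.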
Industry-Specific Applications. Second-party data partnerships have emerged as particularly valuable for CX management in a number of industries, each with unique applications and implementation approaches. In the healthcare industry, pharmaceutical companies and healthcare providers have developed collaborative data models that maintain strict HIPAA compliance while allowing more personalized patient engagement. For instance, Pfizer partnered with the Providence St. Joseph Health hospital network to analyze de-identified patient data alongside medication adherence information, creating predictive models that identify at-risk patients for targeted intervention programs. This collaboration improved medication adherence for patients with chronic conditions while concurrently reducing adverse events (Wei et al., 2024).
The automotive industry has similarly developed innovative use cases with second-party data. For instance, Ford Motor Company’s partnership with insurance provider State Farm created a data exchange in which vehicle telematics data, such as customers’ driving patterns and vehicle usage, is shared with the insurer to offer usage-based insurance policies. Ford drivers who opt into the offer receive an average 10-15% reduction in premiums, while State Farm gains access to highly accurate driving behavior data that significantly improves its risk models (Adriano, 2022).
Finally, Kroger’s partnership with CPG brands through its Kroger Precision Marketing platform (KPM) transformed the grocery chain’s first-party purchase data, used largely for marketing to its own customer base, into valuable second-party data for over two thousand CPG brands like Nestle and General Mills. For example, a CPG brand might receive insights into which households buy organic products, allowing it to run targeted campaigns. In return, brands may share campaign performance data or product-specific metrics with KPM, creating a reciprocal exchange typical of second-party partnerships (Tinuiti, 2021).
Technical implementation models for second-party data. The technical implementation of second-party data partnerships has evolved substantially from early approaches that relied primarily on batch data transfers and manual integration processes. Current best practices employ three primary models, each with distinct advantages and limitations. First, the clean room approach, pioneered by technology companies like Snowflake and Databricks, creates a neutral computational environment where both parties can analyze combined datasets without directly accessing each other’s raw data. This model has gained a lot of traction in recent years (Snowflake, 2025). Second, API-driven approaches provide a more dynamic implementation model, allowing real-time data exchange between partner systems, although they require significantly higher initial development investment, creating a barrier for smaller organizations. Third, the emerging federated learning model represents the most sophisticated implementation approach, allowing AI models to be trained across multiple partners’ datasets without directly sharing the underlying data (Kairouz et al., 2019). Although this model is still relatively early in its development and adoption, it has the benefit of reducing data transfer requirements significantly while maintaining the accuracy achieved through traditional centralized approaches.
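For readers who want to see the mechanics of the third model, the following Python sketch illustrates federated averaging, one common way of doing federated learning. It is a toy example under loud assumptions: two hypothetical partners each hold a handful of (standardized feature, outcome) pairs, the model is a one-variable linear regression, and all function names are invented for illustration. Only model weights cross the partner boundary; the raw records never do.

```python
def local_update(weights, data, lr=0.1, epochs=20):
    """Run gradient descent on one partner's private data and return new weights."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of mean squared error for the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Combine partner weights, weighting each partner by its record count."""
    total = sum(sizes)
    w = sum(update[0] * size for update, size in zip(updates, sizes)) / total
    b = sum(update[1] * size for update, size in zip(updates, sizes)) / total
    return (w, b)


def train(partner_datasets, rounds=10):
    """Coordinate training: send weights out, collect updates, average, repeat."""
    weights = (0.0, 0.0)
    sizes = [len(data) for data in partner_datasets]
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in partner_datasets]
        weights = federated_average(updates, sizes)
    return weights


# Hypothetical private datasets; both happen to follow y = 2x + 1.
retailer_data = [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0)]
brand_data = [(-2.0, -3.0), (2.0, 5.0)]
w, b = train([retailer_data, brand_data])
```

In this toy setting the shared model recovers the common relationship even though neither partner ever sees the other's rows. Real deployments add secure aggregation, handle partners whose data follow different patterns, and typically trade some accuracy for the privacy gain.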
The limitations and challenges associated with second-party data. As should be clear from this discussion, second-party data has tremendous value, versatility, and benefits to the brands involved. However, there are also some limitations to consider and address to realize its full potential. These challenges span technical, operational, legal, and strategic domains, each of which will be briefly considered here. The most immediate challenge in second-party data partnerships involves the technical integration of disparate data systems. According to research by Forrester, a significant proportion of second-party data initiatives encounter delays due to incompatible data formats and structures (Forrester, 2021). These integration challenges often stem from basic differences in how organizations collect and store customer data. Even seemingly basic customer attributes like purchase dates, demographic information, and product categorizations often use incompatible structures or taxonomies. This challenge of integration complexity creates substantial time-to-value delays, with industry analysts reporting that data integration is consistently ranked as one of the top challenges in implementing second-party data partnerships.
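A small Python sketch shows why even "basic" attributes cause integration delays. It assumes, purely for illustration, that two partners record purchase dates in three different formats and label the same product category differently; the format list, partner names, and category codes are hypothetical.

```python
from datetime import date, datetime

# Hypothetical date formats used by different partners for the same attribute.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")

# Hypothetical mapping from each partner's category label to a shared taxonomy.
CATEGORY_MAP = {
    ("partner_a", "Power Tools"): "tools",
    ("partner_b", "TOOLS-PWR"): "tools",
}


def normalize_purchase_date(raw):
    """Parse a partner-supplied date string into a single canonical date."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # Not this format; try the next one.
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_category(partner, label):
    """Translate a partner-specific category label into the shared taxonomy."""
    try:
        return CATEGORY_MAP[(partner, label)]
    except KeyError:
        raise ValueError(f"no mapping for {label!r} from {partner}") from None
```

Note that a string like "03/07/2024" is ambiguous between month-first and day-first conventions. Code cannot resolve that on its own, so the partners have to agree on it in advance, which is exactly the kind of coordination that slows time-to-value.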
Beyond technical limitations, second-party partnerships face significant operational challenges that can undermine their effectiveness. One recent influential study found that organizational alignment issues often contribute more to data partnership failures than technical limitations (Ransbotham et al., 2021). These operational misalignments often manifest in different data refresh cycles, inconsistent identity resolution approaches, and conflicting campaign activation timelines. For instance, home improvement retailer Lowe’s and appliance manufacturer Whirlpool encountered precisely these frictions in their attempted data partnership, where the differences in operational cadences created persistent synchronization issues, ultimately leading to a failed collaboration.
Furthermore, the regulatory landscape surrounding second-party data continues to grow more complex, creating significant compliance challenges. According to the International Association of Privacy Professionals (IAPP), navigating the legal requirements for data sharing agreements represents one of the top concerns for privacy professionals implementing second-party data partnerships (Jones et al., 2024). These concerns appear to be well-founded, as regulations like GDPR, CCPA, and emerging state privacy laws create complex compliance requirements for data partnerships (Snowflake, 2025). The World Privacy Forum has documented numerous cases where insufficient consideration of regulatory requirements led to compliance issues in data-sharing arrangements (Kaye & Dixon, 2023). Even sophisticated and largely integrated organizations like Disney and Hulu (before Disney’s complete acquisition) struggled to create unified customer profiles from their respective first-party datasets due to different identity frameworks, permission structures, and data collection methodologies. Their experience highlights how different consent collection practices and interpretations of regulatory requirements can create substantial barriers to effective data partnerships.
Perhaps the most significant limitation of second-party data stems from strategic misalignment between partners. Research has found that misaligned expectations and objectives represent a primary cause of data partnership failures (Davenport & Bean, 2018). As mentioned earlier, this misalignment can often appear as asymmetric value perceptions, with one partner believing they contribute more valuable data than they receive. The lack of standardized methods for evaluating the relative value of different data assets before integration creates substantial risk for organizations considering second-party partnerships, particularly when engaging with partners of different sizes or from different industry verticals.
Given these risks and challenges, it may pay to start incrementally and then expand the relationship as comfort and familiarity with the partner increases. The case of REI and the hiking app AllTrails provides useful guidance for organizations considering second-party data partnerships. At the start of their relationship, instead of attempting to integrate their entire customer datasets, the two brands started collaborating on a narrowly defined use case, delivering personalized experiences to REI customers who frequently hiked in regions where REI was opening new stores. This focused approach limited complexity while delivering clear value to both organizations and their shared customers.
Third-party data
Third-party data, defined as information collected by entities without direct consumer relationships and typically sold through data brokers or platforms, has traditionally been the default solution for brand personalization. While increasingly scrutinized from privacy and regulatory perspectives, it still plays a significant role in the personalization strategies of many companies, even as first- and second-party data gains prominence (Martin & Murphy, 2017). For example, American Express uses third-party demographic and lifestyle data to enhance its understanding of cardholder segments. By enriching its first-party transaction data with third-party insights about income levels, home ownership, and discretionary spending patterns in other categories, American Express creates more nuanced customer segments that power its personalized offers (Altman et al., 2018). The company implemented a sophisticated propensity modeling approach that combines proprietary transaction insights with third-party data on competitive spending patterns. By integrating information from data brokers about cardholder spending with competitors, American Express identified specific merchant categories where its customers frequently use non-Amex cards, resulting in highly targeted rewards and cashback offers designed to shift share-of-wallet to their cards (Davenport, 2020).
This strategy proved particularly effective during the pandemic when American Express observed changing spending patterns through first-party data but lacked the context on broader consumer behavior shifts to explain them or turn them into actionable insights. Third-party data from consumer panels and retail analytics firms helped the company realize that certain spending decreases represented category contractions rather than competitive losses, which in turn led to more nuanced retention strategies in a volatile environment (American Express, 2021). The company has also pioneered the use of “anonymized identity graphs” that connect third-party data to first-party profiles without relying on persistent identifiers like cookies. This approach, developed in anticipation of increasing privacy restrictions, allows American Express to maintain personalization capabilities while gradually reducing dependence on traditional third-party identifiers (Joseph, 2019).
Media agency Mindshare demonstrated the value of third-party data in audience expansion through its work with a luxury automotive brand. When the brand’s first-party data identified its most valuable customer segment, Mindshare used third-party data to identify “look-alike” audiences with similar characteristics but no previous brand interaction. This approach expanded the personalization reach beyond existing customers while maintaining targeting precision.
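The look-alike idea can be sketched in a few lines of Python. This is a simplified illustration, not Mindshare's method: it assumes each person is described by a short numeric attribute vector (the attributes, identifiers, and the 0.95 threshold are hypothetical) and scores prospects by cosine similarity to the average profile of the seed segment.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Average profile of the seed segment, computed attribute by attribute."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def find_lookalikes(seed_vectors, prospects, threshold=0.95):
    """Return (prospect_id, score) pairs at or above the threshold, best first."""
    center = centroid(seed_vectors)
    scored = [(pid, cosine(vector, center)) for pid, vector in prospects.items()]
    matches = [pair for pair in scored if pair[1] >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical attributes: [income index, luxury affinity, price sensitivity].
best_customers = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2]]  # first-party seed segment
third_party_prospects = {"p1": [0.85, 0.8, 0.1], "p2": [0.1, 0.2, 0.9]}
lookalikes = find_lookalikes(best_customers, third_party_prospects)
```

Here only the prospect whose profile points in nearly the same direction as the seed segment is returned. Raising the threshold tightens precision at the cost of reach, which is the basic trade-off in audience expansion.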
However, the limitations of third-party data have become increasingly apparent as it has grown in popularity. Hotel chain Hilton discovered significant accuracy issues when testing third-party income data against verified income information from loyalty program members who applied for its co-branded credit card. The discrepancies were substantial enough that Hilton scaled back its use of third-party data for personalization, focusing instead on expanding its first-party data collection. It is worth noting that this issue of invalid or missing data has always plagued third-party data simply because it is hard to verify the provenance of this data and understand all the processes used for data collection and validation by the vendor.
Increasing regulations present another significant challenge. As mentioned earlier in this article, beauty retailer Sephora faced a $1.2 million settlement in California for alleged violations related to its use of third-party tracking technologies. This case is one among many, highlighting the growing compliance risks associated with third-party data. It can even be argued that the liability of third-party data has accelerated the shift of many brands and companies toward zero-party and first-party strategies and away from third-party data.
Perhaps most concerning for marketers relying on third-party data, research conducted over the past five years or so by consultants like McKinsey and Deloitte and by the Association of National Advertisers consistently suggests that personalization initiatives based primarily on third-party data usually deliver lower ROI than those centered around first-party data. Furthermore, these studies attribute the performance gap to a number of factors that may be difficult to remedy, such as data quality issues, targeting inaccuracies, and growing consumer resistance to personalization that feels surveillance-based rather than service-oriented.
Despite these challenges, third-party data, which marketers and customer experience managers have used for decades, remains relevant for specific applications. For instance, unlike Netflix, which emphasizes first-party data, Hulu uses third-party viewership trends and content preference data to inform content recommendations for new subscribers with limited viewing history. This addresses the “cold start” problem that hampers personalization for new users for whom little first-party data is available yet.
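A cold-start fallback of this kind can be sketched in Python as follows. This is an illustrative simplification rather than Hulu's actual logic: the five-title threshold, the function name, and the input structures are all assumptions made for the example.

```python
from collections import Counter

MIN_HISTORY = 5  # Hypothetical cutoff below which a subscriber counts as "new".


def recommend(viewing_history, trending_titles, titles_by_genre, k=3):
    """Pick up to k unseen titles.

    viewing_history: list of (title, genre) pairs from first-party data.
    trending_titles: ranked list of titles from an external (third-party) source.
    titles_by_genre: catalog lookup from genre to a ranked list of titles.
    """
    seen = {title for title, _ in viewing_history}
    if len(viewing_history) < MIN_HISTORY:
        # Cold start: too little first-party signal, so lean on external trends.
        candidates = trending_titles
    else:
        # Enough history: recommend from the subscriber's most-watched genre.
        top_genre = Counter(genre for _, genre in viewing_history).most_common(1)[0][0]
        candidates = titles_by_genre.get(top_genre, [])
    return [title for title in candidates if title not in seen][:k]
```

As a subscriber accumulates history, the same function quietly shifts from third-party to first-party signals, which mirrors the limited, gap-filling role that third-party data plays here.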
The announcement by Google regarding the planned deprecation of third-party cookies in its Chrome browser, first targeted for 2022 and subsequently postponed more than once, marked a significant shift in the relative importance of third-party data (Warmouth, 2024). This move, aligned with prior actions by Apple’s Safari and Mozilla’s Firefox browsers, represented a significant reorientation in the operational framework behind digital advertising. Surveys such as those conducted by IAB over the 2022-2024 period consistently found that most marketing professionals expect substantial adverse effects on their existing personalization strategies when third-party cookies are eliminated completely.
Brands have responded to these contextual changes in obvious ways, such as by expanding their first-party data collection initiatives and investing more in their loyalty programs. Retailers like Target and Home Depot incentivize their customers to identify themselves in their different channels, particularly in their stores, to reduce reliance on third-party data while maintaining personalization capabilities (e.g., Wassel, 2025). As another example, The New York Times shifted from third-party audience segments to developing proprietary first-party audience segments based on reader behavior, reporting a 20% increase in campaign performance metrics (Southern, 2020).
The privacy-focused shift away from third-party data is also evident in Apple’s implementation of its App Tracking Transparency framework, which requires explicit user permission for cross-app tracking. Early analytics indicated that approximately 96% of U.S. users opted out of tracking, substantially reducing the flow of behavioral data into third-party ecosystems (Axon, 2021). The impact has been significant for mobile advertisers previously dependent on this data stream. Meta estimated that these privacy changes would cost it approximately $10 billion in revenue in 2022 alone (Leswing, 2022). In response to these challenges, data brokers like Acxiom and Epsilon have pivoted toward “privacy-preserved” approaches to third-party data usage. These methods, including clean rooms, differential privacy, and federated learning, aim to maintain personalization capabilities while addressing privacy concerns.
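Of the three methods just named, differential privacy is the easiest to illustrate briefly. The Python sketch below shows the standard Laplace mechanism applied to a simple count (for example, how many people fall into an audience segment). It is a textbook illustration, not a description of how Acxiom or Epsilon implement it; the function name and the example figures are hypothetical.

```python
import random


def private_count(true_count, epsilon, rng=None):
    """Return a count with Laplace noise added, for epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1 (sensitivity 1),
    so the noise scale is 1 / epsilon. Smaller epsilon means more privacy and
    more noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    # The difference of two independent exponential draws with rate epsilon
    # follows a Laplace distribution with scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Hypothetical usage: release a segment size without exposing the exact figure.
released = private_count(true_count=1000, epsilon=0.5)
```

Any single released number is slightly off, but the noise averages out across many queries, so aggregate insights remain usable while no individual's presence in the data can be confidently inferred.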
Thus, what role third-party data will play in AI-based personalization going forward is a bit murky. On the one hand, it is facing declining relevance for broad demographic targeting, but at the same time, it is increasing in importance for enriching available customer information from other data types and expanding into adjacencies. Organizations with sophisticated data strategies increasingly adopt a hybrid approach where they reserve third-party data for specific, high-value use cases and try to use zero-party and first-party data for their core personalization campaigns (Assur & Rowshankish, 2020).
Combining data to create comprehensive customer profiles
Today's most sophisticated personalization strategies typically use multiple types of data. The idea is to integrate many data sources and types to create a more complete customer profile. This approach creates multiplicative utility in the form of personalization capabilities that are greater than the sum of their parts (Wedel & Kannan, 2016). Home Depot’s DIY project support ecosystem offers an excellent case study. The retailer collects zero-party data through project questionnaires, first-party data through purchase history and website interactions, second-party data through partnerships with service providers, plus carefully vetted third-party data. This multi-pronged data strategy has set the company up for personalization now and for the future, with campaigns addressing what products a customer might need, when they might need them, how they will use them, and what expertise they require (Dawar, 2018). Furthermore, Home Depot’s implementation has evolved substantially since its initial rollout. The retailer now employs project prediction algorithms that synthesize purchase history with third-party property data to identify likely home improvement projects before they begin. For instance, by combining its first-party data showing a customer’s recent measuring tool purchases with third-party data providing estimates of their home’s age and square footage, Home Depot can accurately predict an upcoming kitchen remodel. This prediction triggers a personalized communication sequence offering suitable products, services, and guidance timed to the project’s estimated progression.
Along similar lines, Peloton has combined different types of data to create highly personalized experiences, including zero-party data from preference settings, first-party data from workout behaviors, second-party data from music streaming partners, and third-party weather data, all of which inform the company’s recommendation algorithms. Because of this integration, Peloton can suggest not just any workout but the right workout for each subscriber’s preferences, fitness level, available time, and even local weather conditions. Peloton’s data integration also extends to its hardware products. The company correlates performance metrics from equipment sensors with third-party biometric data (when users connect compatible devices) to create personalized resistance and intensity recommendations. When third-party data indicates unusual heart rate patterns or recovery metrics, Peloton’s algorithms dynamically adjust workout recommendations to prevent overtraining, which would be impossible without this integrated approach (Chand, 2024).
Like Peloton, the now-defunct financial tracking service Mint provides a useful example of how data integration can create entirely new value propositions (its shuttering had to do with its pricing strategy, not its CX strategy). By combining first-party transaction data, zero-party data about financial goals from customers, and third-party benchmark information, Mint generated personalized insights that would be impossible to obtain with any one type of data source. A recommendation to reduce dining expenses, for example, combined the subscriber’s actual spending data (first-party data), stated budget goals (zero-party data), and contextual information about typical spending patterns in the user’s location and income bracket (third-party data).
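The dining example can be expressed as a simple rule that takes one input from each data type. The Python sketch below is only an illustration of that combination; the function name, wording, and dollar figures are hypothetical and do not reflect how Mint actually generated its insights.

```python
def dining_insight(actual_spend, budget_goal, peer_typical):
    """Combine three data types into one recommendation.

    actual_spend: observed monthly dining spend (first-party data).
    budget_goal:  the user's stated monthly dining budget (zero-party data).
    peer_typical: typical spend for similar households (third-party benchmark).
    Returns a message, or None when the user is within budget.
    """
    if actual_spend <= budget_goal:
        return None
    overage = actual_spend - budget_goal
    message = (
        f"Dining spend of ${actual_spend:.0f} is ${overage:.0f} "
        f"over your ${budget_goal:.0f} goal."
    )
    if actual_spend > peer_typical:
        # The benchmark adds context that neither of the other inputs provides.
        message += (
            f" Similar households in your area typically spend about ${peer_typical:.0f}."
        )
    return message
```

Remove any one input and the insight degrades: without the goal there is nothing to be over, and without the benchmark the user cannot tell whether the goal itself was realistic.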
The company also worked deliberately to continuously evolve and improve its integrated approach, a cornerstone of CX management. Even at the point when it went dark, the service was still adding third-party financial data sources, including rental payment history and utility payment records, to provide more comprehensive financial guidance. For those interested in building their credit, the company combined these alternative data points with traditional credit information to provide personalized “credit builder” recommendations, prioritizing actions based on each user’s specific credit profile and financial behavior patterns.
On the negative side, data integration can also introduce significant technical and organizational challenges. Media conglomerate NBCUniversal struggled for years to create a unified view of customers across its broadcast, cable, streaming, theme park, and merchandise divisions due to siloed data architectures, inconsistent identity frameworks, and varying levels of customer consent across properties. The integration process at NBCUniversal hit several specific obstacles common to cross-channel personalization efforts that will be familiar to those working with large customer databases. For instance, the company discovered that third-party data providers used inconsistent geographic taxonomies across divisions, creating situations where the same household might be classified in different demographic segments depending on which division acquired the data. Similarly, identity resolution proved particularly challenging when matching streaming service accounts (often shared across households) with theme park visitors (identified through different systems). NBCUniversal ultimately created a dedicated “data integration council” with representatives from each division to establish standardized frameworks for third-party data acquisition and integration. Many brands and companies of all sizes face similar challenges as they struggle to implement truly integrated personalization strategies despite their obvious theoretical appeal.
Managing the privacy-personalization paradox
Having traced the evolution, applications, and potential limitations of zero-party, first-party, second-party, and third-party data in this article, it is clear that these data types will continue to fuel AI-based personalization of customer experiences in the coming years, even as AI blossoms into something we cannot yet conceive. Things continue to evolve rapidly on many fronts. There are new regulations, technological innovations, changing consumer expectations, and competitive dynamics to deal with, all of which collectively demand a recalibration of how to design and implement personalization strategies.
The degradation of third-party data’s value, evidenced by accuracy problems faced by Hilton, compliance challenges faced by Sephora, and the imminent deprecation of third-party cookies by major browsers, has accelerated the industry-wide shift toward first-party and zero-party approaches. This transition represents not simply a tactical change but rather a strategic recalibration in how organizations think of the relationship between data, privacy, and personalization.
In this new landscape, the brands positioned to thrive will be those that recognize personalization and privacy as complementary rather than competing objectives. As demonstrated by companies like American Express with its “anonymized identity graphs” and Target with its comprehensive “guest profiles,” sophisticated data strategies increasingly emphasize consent clarity, purpose limitation, and data minimization while maintaining robust personalization capabilities. These principles, once viewed primarily through a compliance lens, are increasingly recognized as competitive differentiators that build consumer trust and loyalty.
The technical implementation of personalization is also evolving toward more architectures that preserve customer privacy. The emergence of clean room environments for second-party data partnerships, federated learning models that extract insights without centralizing sensitive information, and on-device processing that keeps personal data under the customer’s control all point toward a future where CX personalization relies less on massive data stockpiles and more on intelligent, focused analysis of customer information.
This technical evolution is accompanied by a philosophical recalibration of what constitutes “effective” personalization. Where earlier approaches often emphasized the breadth of data collection and granularity of targeting, emerging models prioritize contextual relevance, consent quality, and transparent value exchanges. Personalization strategies that clearly articulate the consumer benefit derived from data sharing, like CVS Health’s opt-in health insights program, achieve superior engagement precisely because they respect consumer agency and preference.
The integration of multiple data types will continue to provide a competitive advantage, as evidenced by Home Depot’s multi-layered approach that combines zero-party project questionnaires, first-party purchase history, second-party service provider partnerships, and targeted third-party property data. However, this integration will increasingly follow privacy-by-design principles, with third-party data relegated to specific gap-filling functions rather than core personalization capabilities.
Perhaps most significantly, as AI capabilities continue to advance, we are likely to witness a fundamental inversion of the traditional personalization paradigm. Rather than organizations collecting ever-expanding datasets to power their AI models, we may see the emergence of personal AI agents that serve as intermediaries between consumers and the brands seeking to engage them. These agents, operating with the individual’s explicit consent and in line with their stated preferences, could negotiate personalization parameters on the user’s behalf, ensuring relevance while protecting privacy boundaries.
For customer experience leaders navigating this complex future, several strategic implications emerge. First, investments in first-party data infrastructure and systematic zero-party collection mechanisms could deliver a superior ROI when compared to those in legacy third-party vendors and data sources. Relatedly, a shift from volume-based metrics to quality-adjusted metrics is imminent, with significant implications for pricing strategy in this space. Second, transparency is likely to become a non-negotiable part of managing customer personalization campaigns. Organizations that clearly communicate data usage, provide granular consent options, and deliver tangible value to their customers in exchange for data sharing should achieve superior customer engagement and financial returns from their CX activities. Third, the ethical governance of data is likely to transition from a compliance function to a core strategic capability, with cross-functional oversight bodies like the one at NBCUniversal becoming standard in organizations with ambitious personalization goals.
The organizations that thrive in this evolving landscape will be those that view data not as something to be collected in ever-increasing volumes, but as a responsibility to be managed thoughtfully in the service of providing genuinely valuable customer experiences. As regulations continue to evolve and consumer expectations rise, the ability to deliver deeply personalized experiences with minimal data collection will emerge as a key competitive differentiator, separating organizations that merely gather data from those that transform it into meaningful customer value.
References
Acquisti, A., Taylor, C., & Wagman, L. (2016). The economics of privacy. Journal of Economic Literature, 54(2), 442-92.
Adams, P. (2024). PepsiCo experiments with smart cans, AI tech to improve personalization. Marketing Dive, June 24. Available online at: https://www.marketingdive.com/news/pepsi-gatorade-AI-assistant-smart-can-cannes-lions/719533/
Adriano, L. (2022). State Farm announces huge new Ford partnership. Insurance Business, February 28. Available online at: https://www.insurancebusinessmag.com/us/news/breaking-news/state-farm-announces-huge-new-ford-partnership-326873.aspx
Altman, E. J., Nagle, F., & Tushman, M. L. (2018). Innovating without information constraints: Organizations, communities, and innovation when information costs approach zero. Oxford Handbook of Creativity, Innovation, and Entrepreneurship: Oxford University Press.
Amante, D. J., Harlan, D. M., Lemon, S. C., McManus, D. D., Olaitan, O. O., Pagoto, S. L., Gerber, B. S., & Thompson, M. J. (2021). Evaluation of a diabetes remote monitoring program facilitated by connected glucose meters for patients with poorly controlled type 2 diabetes: randomized crossover trial. JMIR Diabetes, 6(1), e25574.
Amatriain, X., & Basilico, J. (2016, September). Past, present, and future of recommender systems: An industry perspective. In Proceedings of the 10th ACM conference on recommender systems (pp. 211-214).
American Express. (2021). Annual Report 2020. Available online at: https://s26.q4cdn.com/747928648/files/doc_financials/2020/ar/2020-Annual-Report-(1).pdf
Association of National Advertisers. (2021). The future of programmatic report, 2021 edition. ANA Research Report. Available online at: https://www.ana.net/miccontent/show/id/rr-2021-future-programmatic.
Assur, N., & Rowshankish, K. (2020). The data-driven enterprise of 2025. McKinsey Digital. Available online at: https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/the-data-driven-enterprise-of-2025
Axon, S. (2021). 96% of US users opt out of app tracking in iOS 14.5, analytics find. Ars Technica, May 7. Available online at: https://arstechnica.com/gadgets/2021/05/96-of-us-users-opt-out-of-app-tracking-in-ios-14-5-analytics-find/
Barnett, K. (2022). Marketing leads at Pepsi and P&G share priorities for building retail media strategies. The Drum, October 26. Available online at: https://www.thedrum.com/news/2022/10/25/marketing-leads-pepsi-pg-share-priorities-building-retail-media-strategies
Bloudoff-Indelicato, M. (2019). The science behind your favorite workout playlist. Outside Magazine, January 31. Available online at: https://www.outsideonline.com/health/training-performance/how-music-can-help-your-workout/
Branscum, K. (2024). The irony of data privacy and personalization. Typeform. Available online at: https://www.typeform.com/blog/irony-of-data-privacy-and-personalization
Brough, A. R., & Martin, K. D. (2021). Consumer privacy during (and after) the COVID-19 pandemic. Journal of Public Policy & Marketing, 40(1), 108-110.
California Attorney General. (2022). Attorney General Bonta Announces Settlement with Sephora as Part of Ongoing Enforcement of California Consumer Privacy Act. State of California Department of Justice. Available online at: https://oag.ca.gov/news/press-releases/attorney-general-bonta-announces-settlement-sephora-part-ongoing-enforcement.
Chand, D. (2024). How Peloton uses data to get customers. The Digital Chapter, July 5. Available online at: https://www.thedigitalchapter.com/p/peloton-dtc-growth-marketing-strategy
Davenport, T. H. (2020). The AI Advantage: How to Put the Artificial Intelligence Revolution to Work. MIT Press.
Davenport, T. H., & Bean, R. (2018). Big companies are embracing analytics, but most still don’t have a data-driven culture. Harvard Business Review Digital Articles. Available online at: https://hbr.org/2018/02/big-companies-are-embracing-analytics-but-most-still-dont-have-a-data-driven-culture.
Dawar, N. (2018). Marketing in the age of Alexa. Harvard Business Review, 96(3), 80-86.
Dholakia, U. (2023). Transparency in Business: An Integrative View. Springer Nature.
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method, 4th Edition. Wiley.
Duhigg, C. (2012). How companies learn your secrets. The New York Times Magazine, February 16. Available online at: https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html
Experian (2022). Treating data with respect. Experian Annual Report 2022. Available online at: https://www.experianplc.com/content/dam/marketing/global/plc/en/assets/documents/reports/2022/annual-report/reports/022-treating-data-with-respect-1091-T01.html
Forrester. (2021). The future of privacy and data protection: Growth and competitive differentiation. Forrester Research Report. Available online at: https://www.forrester.com/report/the-future-of-privacy-and-data-protection-growth-and-competitive-differentiation/RES61244
Forrester. (2022). 2022 US Consumer Privacy Segmentation. Forrester Research Report. Available online at: https://www.forrester.com/report/2022-us-consumer-privacy-segmentation/RES178258
Gomez-Uribe, C. A., & Hunt, N. (2015). The Netflix recommender system: Algorithms, business value, and innovation. ACM Transactions on Management Information Systems, 6(4), 1-19.
Hartmans, A., Tobin, B., Mayer, G., & Jackson, S. (2024). The rise and fall of Peloton, from pandemic-era success story to its stock hitting a record low. Business Insider, May 2.
Higley, A., & Snyder, N. (2024). HelloFresh review: I tried a meal prep service for a week - here’s what I thought. Taste of Home. Available online at: https://www.tasteofhome.com/article/hellofresh-meal-prep-service-review/
Hoofnagle, C. J., van der Sloot, B., & Borgesius, F. Z. (2019). The European Union general data protection regulation: What it is and what it means. Information & Communications Technology Law, 28(1), 65-98.
IAB. (2022). State of Data 2022: The Data Privacy Landscape. Interactive Advertising Bureau. Available online at: https://www.iab.com/insights/state-of-data-2022/
Jennings, L. (2022). How Chipotle built its $3 billion digital business. Nation’s Restaurant News, May 10. Available online at: https://www.nrn.com/top-500-restaurants/how-chipotle-built-its-3-billion-digital-business
Jones, J., Kanthaswamy, S., Saniuk-Heinig, C., & Fischer, L. (2024). Privacy governance report 2024. IAPP Research Report. Available online at: https://iapp.org/resources/article/privacy-governance-report/
Joseph, S. (2019). How American Express is preparing for a world without cookies. Digiday, November 6. Available online at: https://digiday.com/media/how-american-express-is-preparing-for-a-world-without-cookies/
Kairouz, P., McMahan, H. B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A. N., ... & Zhao, S. (2021). Advances and open problems in federated learning. Foundations and Trends® in Machine Learning, 14(1–2), 1-210.
Keeter, S., Lau, A., & Kennedy, C. (2021). What 2020’s election poll errors tell us about the accuracy of issue polling. Pew Research Center Report. Available online at: https://www.pewresearch.org/methods/2021/03/02/what-2020s-election-poll-errors-tell-us-about-the-accuracy-of-issue-polling/
Korganbekova, M., & Zuber, C. (2023). Balancing user privacy and personalization. Working paper, Northwestern University. Available online at: https://bit.ly/4bElxaY
Kroger. (2021). Kroger Precision Marketing expands private marketplace reach. Kroger Newsroom. Available online at: https://ir.kroger.com/CorporateProfile/press-releases/press-release/2021/Kroger-Precision-Marketing-Expands-Private-Marketplace-Reach/default.aspx.
Lake, K. (2018). Stitch Fix’s CEO on selling personal style to the mass market. Harvard Business Review. Available online at: https://hbr.org/2018/05/stitch-fixs-ceo-on-selling-personal-style-to-the-mass-market
Leswing, K. (2022). Facebook says iOS privacy change will result in $10 billion revenue hit this year. CNBC, February 2. Available online at: https://www.cnbc.com/2022/02/02/facebook-says-apple-ios-privacy-change-will-cost-10-billion-this-year.html
Lim, I., & Pinal, J. (2023). First-party data and customer trust. Deloitte Research Report. Available online at: https://www2.deloitte.com/ph/en/pages/technology-media-and-telecommunications/articles/first-party-data-apac.html
Lindecrantz, E., Tjon Pian Gi, M., & Zerbi, S. (2020). Personalizing the customer experience: Driving differentiation at retail. McKinsey Report. Available online at: https://bit.ly/43Tobrm
Martin, K. D., & Murphy, P. E. (2017). The role of data privacy in marketing. Journal of the Academy of Marketing Science, 45(2), 135-155.
Mastercard (2025). Consent management documentation: Use cases. Available online at: https://developer.mastercard.com/consent-management/documentation/use-cases/
McKinsey & Company. (2020). The data-driven enterprise of 2025. McKinsey Digital. Available online at: https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/the-data-driven-enterprise-of-2025.
Michu, S. (2024). P&G’s AI-driven approach to consumer insights. CIO Magazine, December 17. Available online at: https://www.cio.inc/pgs-ai-driven-approach-to-consumer-insights-a-27080
Mozilla. (2022). Firefox and privacy: Our approach to privacy-preserving features. Mozilla Blog. Available online at: https://blog.mozilla.org/en/products/firefox/firefox-privacy-approach/.
Netflix (2023). What we watched: A Netflix engagement report, December 12. Available online at: https://about.netflix.com/en/news/what-we-watched-a-netflix-engagement-report
PayPal (2024). PayPal data protection addendum for payment processing products, November 18. Available online at: https://www.paypal.com/us/legalhub/paypal/data-protection
Poole, T. (2017). How businesses are leveraging digital payments. Capital One blog. Available online at: https://www.capitalone.com/tech/software-engineering/digital-payments-in-an-order-ahead-world/
Ramaswamy, S. (2016). How micro-moments are changing the rules. Think with Google. Available online at: https://www.thinkwithgoogle.com/marketing-strategies/app-and-mobile/how-micromoments-are-changing-rules/.
Ransbotham, S., Khodabandeh, S., Kiron, D., Candelon, F., Chu, M., & LaFountain, B. (2020). Expanding AI’s impact with organizational learning. MIT Sloan Management Review. Reprint Number 62270.
Rust, R. T., & Huang, M. H. (2014). The service revolution and the transformation of marketing science. Marketing Science, 33(2), 206-221.
Schrage, M., & Kiron, D. (2018). Leading with next-generation key performance indicators. MIT Sloan Management Review Research Report. Available online at: https://sloanreview.mit.edu/projects/leading-with-next-generation-key-performance-indicators/.
Schultz, D. E., & Block, M. P. (2015). Beyond brand loyalty: Brand sustainability. Journal of Marketing Communications, 21(5), 340-355.
Smith, B., & Linden, G. (2017). Two decades of recommender systems at Amazon.com. IEEE Internet Computing, 21(3), 12-18.
Snowflake (2025). Data clean rooms explained: What you need to know about privacy-first collaboration. Snowflake blog post. Available online at: https://www.snowflake.com/en/blog/data-clean-rooms-privacy-collaboration/
Soper, T. (2024). Starbucks mobile orders surpass 30% of total transactions at U.S. stores. GeekWire. Available online at: https://bit.ly/4kAZJBc
Southern, L. (2020). Inside the New York Times’ first-party data play. Digiday, July 17. Available online at: https://digiday.com/media/the-time-is-right-inside-the-new-york-times-first-party-data-play/
Stitch Fix (2024). How to find your personal style. Available online at: https://www.stitchfix.com/women/blog/fashion-tips/how-to-find-your-personal-style/
Tinuiti (2021). Kroger precision marketing satisfies advertisers’ hunger for targetable audiences. June 14. Available online at: https://tinuiti.com/blog/commerce/kroger-precision-marketing/
Vollero, A., Sardanelli, D., & Siano, A. (2023). Exploring the role of the Amazon effect on customer expectations: An analysis of user‐generated content in consumer electronics retailing. Journal of Consumer Behaviour, 22(5), 1062-1073.
Warmouth, B. (2024). Google ends its third-party cookies deprecation plans for Chrome. Digital Commerce 360, July 24. Available online at: https://www.digitalcommerce360.com/2024/07/24/third-party-cookies-deprecation-google-chrome/
Wassel, B. (2019). AI-powered digital mirror ‘reads’ Sephora shoppers’ look. Retail Touchpoints, March 27. Available online at: https://www.retailtouchpoints.com/topics/customer-experience/ai-powered-digital-mirror-reads-sephora-shoppers-look
Wassel, B. (2025). Target aims to triple its paid loyalty program membership in three years. CX Dive, March 5. Available online at: https://www.customerexperiencedive.com/news/target-circle-triple-paid-loyalty-program/741704/
Wayfair. (2021). Personalizing the home goods shopping experience. Wayfair Engineering Blog. Available online at: https://tech.wayfair.com/data-science/2021/09/personalizing-the-home-goods-shopping-experience/.
Wedel, M., & Kannan, P. K. (2016). Marketing analytics for data-rich environments. Journal of Marketing, 80(6), 97-121.
Wei, Q., Mease, P. J., Chiorean, M., Iles-Shih, L., Matos, W. F., Baumgartner, A., & Hadlock, J. (2024). Machine learning to understand risks for severe COVID-19 outcomes: a retrospective cohort study of immune-mediated inflammatory diseases, immunomodulatory medications, and comorbidities in a large US health-care system. The Lancet Digital Health, 6(5), e309-e322.
Whelan, R., & Toonkel, J. (2024). How Disney is trying to be as addictive as Netflix. Wall Street Journal, July 16. Available online at: https://www.wsj.com/business/media/inside-disneys-mission-to-keep-viewers-glued-to-their-screens-59a29206?
Yoon, H., & Meyvis, T. (2024). Consuming Regardless of Preference: Consumers Overestimate the Impact of Liking on Consumption. Journal of Consumer Research, 51(3), 474-496.
Note that this document was updated in March 2025; however, it is virtually certain that the information it contains will become dated within a few months, and as such, it will be updated regularly.
There are many interesting examples of this so-called intention-behavior gap, not only in the academic literature but also in practice. Consumers routinely say one thing but do another, something that can frustrate product managers and CX designers. For example, over 70% of people say they want to eat healthier food, favoring organic or plant-based options, yet processed snacks and fast food dominate their purchases because convenience and taste win out over the aspiration to be healthy. Likewise, in surveys, a majority of Americans say they want to exercise more, and many sign up for gyms. But most of those who join a gym in January have quit exercising by March, and fitness app growth follows the same seasonal stall as willpower gives way to couch inertia. This gap, pledging to eat kale and do cardio but choosing to eat fries and watch Netflix, shows consumers’ aspirational ideals clashing with their habits. For a brand like HelloFresh, this could mean customers ticking boxes for harissa-spiced lamb in surveys (aspirational and exotic fare) but ordering creamy chicken alfredo (familiar and comforting) when it’s time to cook after a long day. Recipe ratings might skew toward dishes like one-pan beef enchiladas over Vietnamese pho, even if the latter sounds more exciting in theory.
I could write a whole piece on this and its attendant ramifications for customer experience management, and I probably will. As shown by the well-publicized failure of many top-tier political polls in the three previous presidential elections (e.g., Keeter et al., 2021), survey-based data has become suspect for many reasons, and the statistical remedies are often not sufficient. What is true for political polls is even more so for consumer surveys.
For instance, I participated as a member of Netflix’s web panel for six months in 2019, allowing them to track my household’s viewing behaviors combined with periodic surveys and chats in exchange for a free subscription and some cash payments.
Some may argue that Peloton is a bad example to use, given its struggles and fall from its 2020 peak. However, as many business writers have noted, its decline has to do not with its cutting-edge personalization but with bad strategy and bad luck (see, for example, Hartmans et al., 2024).
I know I keep presenting companies that have failed or even gone defunct as poster children of personalization. The reader could be forgiven for thinking that success at personalization may not mean much if it is not supported by a strong business strategy. I believe this point is true and needs further exploration. However, I want to note that I have also referenced numerous companies like Amazon, Netflix, Chipotle, and Home Depot throughout this article that have been tremendously successful with both personalization and financial performance.