Algorithmic Transparency & Consumer Disclosure
This is an edited version of Chapter 6, “Algorithmic Transparency and Consumer Disclosure,” from my book Transparency in Business: An Integrative View (2023), Palgrave Macmillan.
Abstract. This chapter reviews the growing and evolving literature on algorithmic transparency and consumer disclosure. As algorithms are adopted widely in data-driven business decision making, often involving consumers, the questions of which data is collected and used, how it is analyzed, and the various unintended consequences it produces all gain in significance. In the endeavor to formulate a better understanding of business transparency, this chapter considers the research on the various means to formulate and increase algorithmic transparency and the challenges of doing so, along with the host of associated issues concerning consumers whose private data is at stake.
“The near-ubiquitous use of algorithmically driven software, both visible and invisible to everyday people, demands a closer inspection of what values are prioritized in such automated decision-making systems.” – Safiya Umoja Noble.
Consider this ProPublica investigative story from October 2022. It reported on the popularity of dynamic pricing software called YieldStar, used by property managers throughout the United States, that is responsible for steep increases in apartment rents in cities from Seattle to Atlanta to Denver (Vogell, 2022). The service relies on an algorithm that considers apartment characteristics (square footage, the number of bedrooms, the floor level, etc.), the forthcoming availability of apartments within the complex, and the locations and actual rents paid for apartments in that complex and at proximal competitors to calculate the rent to charge for each apartment. The software updates its rent recommendations dynamically every day.
Furthermore, the company recommends that its clients avoid negotiating with potential renters and favor lower occupancy over lower rents. The algorithm runs on lease data from over 13 million rental transactions, which contain actual rents paid (vs. advertised rents; Kovatch, 2022), but the details of how it calculates its output, i.e., precisely which variables are used, how they are combined, and how the rent estimates are calculated, remain entirely opaque to landlords using the software and to their renters. The upshot is that tens of thousands of US renters have suffered hardship by paying significantly higher rents for the same apartments or undertaking an inconvenient and costly move, with little understanding of why rents are increasing.
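The precise model is proprietary, but to make the mechanics concrete, here is a minimal sketch, in Python, of how a rent-recommendation rule of this general kind might combine unit characteristics, comparable transacted rents, and upcoming availability. Every variable name, weight, and adjustment below is hypothetical and invented for illustration; it is not YieldStar’s method.

```python
# Purely illustrative sketch of a dynamic rent-recommendation rule.
# All variable names, weights, and adjustments are hypothetical; the actual
# YieldStar model is proprietary and opaque to landlords and renters alike.

from statistics import median

def recommend_rent(sqft, bedrooms, floor, comp_rents, vacancy_rate):
    """Return a daily-updated rent recommendation for one apartment.

    comp_rents   -- actual rents paid for similar units in the complex and
                    at nearby competitors (not advertised rents)
    vacancy_rate -- share of units in the complex expected to be vacant soon
    """
    # Start from the median of comparable transacted rents.
    base = median(comp_rents)

    # Hypothetical hedonic adjustments for the unit's own characteristics.
    base += 0.15 * (sqft - 850)          # larger units command more
    base += 75.0 * (bedrooms - 1)        # premium per extra bedroom
    base += 10.0 * floor                 # higher floors priced higher

    # Hypothetical scarcity adjustment: fewer upcoming vacancies -> higher rent.
    scarcity_multiplier = 1.0 + 0.5 * (0.05 - vacancy_rate)
    return round(base * max(scarcity_multiplier, 0.9), 2)

# Example: a 900 sq ft, 2-bedroom unit on the 4th floor, with tight availability.
print(recommend_rent(900, 2, 4, comp_rents=[1450, 1500, 1525, 1610], vacancy_rate=0.02))
```

Even in this toy version, the output depends entirely on which comparables are selected, how scarcity is weighted, and how often the numbers are refreshed, and none of those choices are visible to the renter who receives the resulting quote.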
As this case illustrates, on the frontlines of the battle between disclosure and secrecy, the fiercest skirmishes are being fought over the use of algorithms for business purposes. Algorithms, defined as “encoded procedures for transforming input data into a desired output, based on specified calculations… [that] name both a problem and the steps by which it should be solved” (Gillespie, 2014, p. 167), have been adopted in a variety of business applications. Their popularity is driven by a confluence of factors, including a managerial preference for data-driven decision making and personalized offerings, the greater availability of behavioral data (as in the YieldStar example), and services that provide cheap data storage and processing (Edelman & Abraham, 2022).
Algorithms driven by artificial intelligence are used routinely to moderate consumers’ posts, comments, and reviews on content and social media sites, detect and prevent fraudulent financial transactions, audit contracts and financial statements, assess the creditworthiness of buyers, trade financial instruments, recognize voices and faces for automated conversations and identity verification, screen rental applications, set insurance rates by monitoring driving behavior, and set and change prices dynamically (Gillespie, 2020), leading Pasquale (2015) to coin the term “the scored society” for the ubiquitous use of algorithmic scoring in business practice.
As algorithm use proliferates in business, so do collateral concerns from those affected by their outputs. They include concerns about privacy, particularly because many algorithmic applications rely on comprehensive but proprietary private and personal data about their targets, and because the collection, storage, and use of that data are often insufficiently explained and violate contextual norms of privacy (Martin, 2012). Equally salient is the issue of transparency for both parties: those who adopt and use algorithms and those who are affected by them. For algorithm users, typically businesses or third parties that sell services to them, there are questions about how much information about the algorithm and the data used to produce its output can and should be disclosed. For those affected by algorithm outputs, such as customers or renters, the core questions are about understanding what personal information about them is collected, what other data sources are used, and how exactly the algorithm calculates the scores that affect their choices. These are evolving topics, and the growing scholarly literature studying them, which falls loosely under the heading of “algorithmic transparency,” is the subject of this chapter. It should be noted that, as a whole, this academic literature has more questions than answers and is a few steps behind practice, which continues to advance rapidly; nonetheless, it is worth understanding current perspectives to round out our understanding of business transparency.
The Algorithmic Transparency Challenge
In one sense, applying algorithms with artificial intelligence to make business decisions has the veneer of openness and objectivity we have encountered throughout this book. After all, many aspects of algorithms are “objective;” they are well-defined computational procedures tasked with solving computational problems that are equally well-articulated, and the inputs, outputs, and goals all need to be written out mathematically (Cormen et al., 2009). What could be more visible, communicable, and interpretable than a mathematical formula that produces clearly defined actions and replaces subjective human decision making that could be susceptible to biases? In the YieldStar case, for example, it can be argued that potential racial discrimination in quoting rent is bypassed completely because renter demographic characteristics are not included in the model.
Yet, a serious concern is that despite the good intentions of managers and statisticians, using algorithms may unintentionally lead to harmful consequences for those affected by these applications, such as the inadvertent promotion of bias and discrimination in various forms, the propagation of falsehoods such as domain-specific misinformation, conspiracy theories, and toxic speech, and the promotion of inequity by reducing certain groups’ access to information and services (Commerford et al., 2022; Green & Viljoen, 2020; Lewandowsky & Kozyreva, 2022; Seele et al., 2020). As Gillespie (2014) puts it, there are “warm human and institutional choices that lie behind these cold mechanisms” (p. 169) that need to be considered and understood. Even in the case of the YieldStar algorithm, which excludes demographic variables, the rapid increase in rents caused by its dynamic pricing outputs has disproportionately affected lower-income Black and Hispanic renters (Mijares Torres & Marte, 2022).
Compounding this challenge, many artificial intelligence-based methods, such as deep learning, are “inscrutable” (Rai, 2020) and inherently lack explainability. The combined opacity of the mechanism and the surrounding human and social factors makes it impossible for the workings, rationale, or outputs to be learned or understood by anyone, including the coders and implementers of these algorithms. Furthermore, even within a single application domain, the concurrent use of different algorithms, AI methods, and algorithmic decision making (ADM) systems means there is often significant variation in transparency within the same setting (Gorwa, Binns, & Katzenbach, 2020).
All of this is to say that in any given application, understanding the transparency of the algorithms being used is itself riddled with dilemmas and open questions. Pasquale (2015) provides a concrete example of this challenge by asking, “Should a credit card company be entitled to raise a couple’s interest rate if they are seeking marriage counseling? If so, should cardholders know this?” (p. 5). Such questions can easily be extended into a lengthy series in this context or extrapolated to other business applications of algorithms.
Defining Algorithmic Transparency
By now, it should be clear that beyond technical considerations such as scalability, efficiency, and accuracy, the broad-based adoption and success of business applications of algorithms, their regulatory approval by governments, and, by extension, their acceptance by the consumers and employees who are affected by their outputs in consequential ways all hinge on achieving algorithmic transparency. Thus, as a starting point, we need to seek clarity about the concept itself and understand current conceptualizations of it.
At a basic level, algorithmic transparency refers to the user’s and, to a lesser and more abstract extent, the target’s ability to understand the inputs, the outputs, and the process followed by the model to convert inputs into outputs (Arrieta et al., 2020), along with an actionable way to hold users accountable. Diakopoulos (2020) persuasively argues that managers must treat transparency and accountability as key considerations from the outset, when adopting ADM systems, and throughout their use (see also Ananny & Crawford, 2018).
Transparency in ADM Systems
ADM systems typically use an algorithm-based process to generate a judgment or decision such as a dollar value (in the case of price), score, ranking, or classification, which is then used as an input into the creation or cocreation of digitally automated services such as the ones described earlier. For example, ADM systems calculate and change prices in dynamic pricing applications. In some cases, the algorithm’s output may be marketed as the core service itself, for example, a potential renter’s “shadow credit score” that is sold to landlords for screening purposes (Smith & Vogell, 2022) or an estimate of the asking rent using the YieldStar software (Vogell, 2022). In other applications, the algorithm’s output may be used as part of a more extensive service, such as recommendations about other things to buy or watch used by online retailers and streaming services (Vanderbilt, 2016).
Some researchers studying algorithmic transparency rely on a core availability-based definition of algorithmic transparency as “the availability of information about an actor allowing other actors to monitor the workings or performance of this actor” (Meijer, 2014). Another line of research focuses on the opposite end of the conceptual spectrum by considering algorithmic opacity, defined as the lack of knowledge of the involved parties (those writing and using the algorithm and those affected by it), stemming from partial or total ignorance about the inputs used in the algorithm and the absence of a definite sense of how or why a particular outcome was derived from those inputs. Yet, there are specific nuances of “availability” (also called “explainability” in some strains of this literature) in this view of transparency that need to be teased out. For example, some of this ignorance is remediable, while some is not, and which party (user or target) is the focus of the remediation also matters.
Warm Human and Institutional Factors
When compared to other types of business transparency covered in this book, one unique aspect of algorithmic transparency is that it is reasonable to conceive of it not just as the algorithm’s property (as transparency is, say, for a business leader or a supply chain) but as a property of the overall system and the sum of the processes employed in its service. These include the means used by the modeler to develop the algorithm (including the modeler’s mindset and ideological orientation), the specific mathematical steps embedded in the algorithm, the sources and management of the relevant data (for instance, how the data is cleaned and how missing values are dealt with), and how the algorithm’s output is utilized in the cocreation of the service (Creel, 2020).
Thus, both technical and human factors contribute to the ADM system’s transparency. In this sense, ADM systems are sociotechnical systems that “do not contain complexity but enact complexity by connecting to and intertwining with assemblages of humans and non-humans” (Ananny & Crawford, 2018, p. 974). Furthermore, social, cultural, and political factors also shape how human decisions fit into the algorithm’s workings and output (Gorwa, Binns, & Katzenbach, 2020) and ultimately dictate the net effects of transparency, including user perceptions.
As a particular case of a warm institutional factor, Burrell (2016) points out that in business applications, a common reason for algorithmic opacity is intentional corporate secrecy. A company may intentionally wish to keep its algorithms hidden and private to derive a durable advantage over its competitors and differentiate its brand by delivering unique benefits to customers; in fact, it may even take legal steps to do so by claiming its algorithms as intellectual property and trade secrets (Foss-Solbrekk, 2021). For example, Google frequently changes its search ranking algorithms, of which PageRank is the best-known component, but the algorithms and their changes remain proprietary. Businesses looking to raise their rankings have no visibility into the actual inputs; in fact, marketing scholars have argued that algorithmic opacity is a significant reason why there is relatively little research on SEO marketing in academic journals (Dholakia, 2022).
It is noteworthy that hidden behind the cover of justified corporate secrecy, the algorithms used by the company could result in discrimination, unfair advantage, and other adverse outcomes, with targets being none the wiser. This idea is explored in more detail later on in the chapter. It also provides new insight into other forms of business transparency considered in previous chapters by suggesting that it may be more meaningful to view transparency as a property of a business setting that includes institutional and human factors rather than the property of an individual or group within the organization. As one example, this perspective advances the possibility that a business leader’s transparency as a personally held value could easily interact with an organization’s cultural transparency to amplify or attenuate the effects we expect from the individual forms. In sum, the idea of “warm institutional factors” urges us to break down the extant silos between the different areas of business transparency to get at the root of its significance and effects.
Event Versus Process Algorithmic Transparency
Researchers have also identified two distinct forms of transparency that are relevant for assessing an ADM system: transparency about the system’s outputs and outcomes and transparency about the processes behind the algorithm’s development and use. As Ananny and Crawford (2018) put it, “transparency is thus not a precise end state where everything is clear and apparent, but a system of observing and knowing that promises a form of control” (p. 975). Heald’s (2006) broad distinction between event and process transparency in governance provides some valuable insights. When the discloser adopts event transparency, it sheds light on the inputs, outputs, and outcomes, typically measured by proxy variables. Process transparency goes further with respect to understanding by providing information about the processes that convert the inputs to outputs and the linkage processes that lead from outputs to outcomes.
We can distinguish between these forms concretely by considering the algorithmic transparency of shadow credit scores used for tenant screening. In this case, the variables used to calculate the scores are the inputs. The ADM system’s conversion processes encompass the sources of these variables; the processes by which they are encoded, recorded, and used for analysis (Holland et al., 2018); the methods used to ascertain their accuracy; how they are weighted in the algorithm; the nature of non-compensatory rules and the potential biases that might arise from their application (e.g., is a criminal conviction grounds for rejecting the tenant regardless of their creditworthiness, or does a higher credit score compensate for a prior conviction? If a criminal conviction is treated as absolute grounds for rejection, will that harm certain groups disproportionately, and is that acceptable to the user?); and the roles played by human agents vis-à-vis the technology in producing the final output and using it for decision making.
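To make the contrast between compensatory weighting and a non-compensatory rule concrete, consider the following minimal sketch of a tenant-screening decision. The variables, weights, cutoff, and hard-stop rule are all hypothetical and chosen purely for illustration; they do not describe any actual shadow credit scoring product.

```python
# Hypothetical tenant-screening sketch contrasting compensatory weighting
# with a non-compensatory (hard-stop) rule. All weights, variables, and
# thresholds are invented for illustration only.

def shadow_score(credit_score, prior_evictions, monthly_income, rent):
    """Compensatory part: weakness on one input can be offset by strength on another."""
    income_to_rent = monthly_income / rent
    score = 0.5 * (credit_score / 850)              # normalized credit history
    score += 0.3 * min(income_to_rent / 3.0, 1.0)   # ability to pay, capped
    score -= 0.2 * min(prior_evictions, 2) / 2      # prior eviction records
    return max(score, 0.0)

def screening_decision(applicant, cutoff=0.55, conviction_is_hard_stop=True):
    # Non-compensatory rule: if enabled, a criminal conviction rejects the
    # applicant outright, no matter how strong the compensatory score is.
    if conviction_is_hard_stop and applicant["criminal_conviction"]:
        return "reject"
    s = shadow_score(applicant["credit_score"], applicant["prior_evictions"],
                     applicant["monthly_income"], applicant["rent"])
    return "accept" if s >= cutoff else "reject"

applicant = {"credit_score": 760, "prior_evictions": 0, "monthly_income": 5400,
             "rent": 1500, "criminal_conviction": True}

# The same applicant can be accepted or rejected depending on a single design
# choice that is invisible to them.
print(screening_decision(applicant, conviction_is_hard_stop=True))   # reject
print(screening_decision(applicant, conviction_is_hard_stop=False))  # accept
```

Whether the hard stop is switched on, and where the cutoff is set, are precisely the kinds of conversion-process choices that disclosing inputs and outputs alone (event transparency) would never reveal.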
The final shadow credit scores assigned to potential renters are the outputs. Renters’ behaviors over the lease period and afterward constitute the outcomes. Finally, the psychological or sociological processes through which the shadow credit scores explain renters’ future behavior constitute the linkage process. Through this lens, and consistent with the earlier discussion, perfect algorithmic transparency is a hypothetical ideal in which an external observer would have perfect insight into every one of these steps in the renter’s experience. Practically, how much insight observers can get into this entire process characterizes the algorithmic transparency of shadow credit score deployment.
Ribeiro, Singh, and Guestrin (2016) distinguish between “trusting a model” and “trusting its prediction,” which adds meaningfully to this discussion. Trusting a model relates to the belief that the model is correct and will yield good results when used. On the other hand, trusting its prediction has to do with whether the user trusts an individual prediction made by the model enough to act based on its output. These two types of trust affect different entities (the modeler and the decision maker, respectively) and require different actions to support them.
The Algorithmic Transparency Continuum
In generating a potential renter’s shadow credit score, a company may use not only the financial variables typically used to generate a credit score but also other publicly available data that may be only tangentially related, if at all, such as criminal records, speeding tickets, and data on previous evictions. Which specific variables are chosen, their provenance (i.e., reliability and accuracy), and how they are used in the algorithm are just as important as the specific AI method used (e.g., a sparse linear model) and its limitations, how the resulting score is interpreted, what thresholds are established to deem the renter acceptable, and so on. The roles played by human actors such as data analysts and modelers, their skills in utilizing the technology, and, in particular, how much autonomy they have and how they use it all affect the ADM system’s transparency. The result is an algorithmic transparency continuum.
The level of transparency varies, inadvertently or by design. At one end, transparency is the bare minimum: those affected by an algorithm’s output are told that an algorithm is being used, without additional information or contextualization. At the other end, in-depth information regarding each stage of the scoring process is provided, indicative of the maximum possible transparency. In analyzing algorithmic content moderation, Gorwa, Binns, and Katzenbach (2020) argue that while some models are “inherently interpretable” (Rai, 2020), many ADM systems are so complex, owing to a confluence of technical complexity, commercial secrecy, and the subjectivity introduced by a wide range of human choices, that they cannot be deciphered even by experts. Thus, the maximum possible transparency is far from perfect transparency.
Much like transparency in other business settings covered in this book, algorithmic transparency is virtually never defined in binary (i.e., present or absent) and rarely in one-dimensional (i.e., high or low transparency) terms. Instead, it is seen as a continuous multi-dimensional concept that considers various factors, each of which is defined in degree for a given ADM system. We will consider Creel’s (2020) perspective when we discuss process-based transparency in Chapter 7.
Gillespie (2014, 2020) also provides an insightful analysis that helps unpack the multi-dimensional aspects of algorithmic transparency. First, according to him, visibility into the underlying dataset on which the algorithm is run is crucial. The sources of the data, the policies and practices governing its collection (such as whether explicit user consent is obtained), and the methods used to process it all affect algorithmic transparency.
Second, what Gillespie (2014) calls “cycles of anticipation” influence transparency in many business applications. The primary purpose of algorithms is to predict the behavior of a target, say customers, and to act in specific ways at the right time using this prediction, which relies on understanding the target as comprehensively as possible and collecting data about every possible characteristic. An infamous example is the US-based retailer Target’s use of detailed customer demographics and purchase behaviors, such as buying unscented lotion and magnesium supplements, to generate a “pregnancy prediction” score that not only predicted whether a particular customer was pregnant but also estimated the likely due date for those who were. While the retailer’s goal was to use these predictions to send targeted promotions to customers, an unintended byproduct was widespread outrage at the intrusion on privacy (Duhigg, 2012).
This non-specific data trawling impedes algorithmic transparency by making it hard to justify or explain the use of certain variables (e.g., unscented lotion purchases) to predict outcomes (e.g., the customer’s pregnancy). At an abstract level, this approach falls in the same genre as “fishing for significance,” a questionable research practice, with attendant ethical problems, that was widespread among social science researchers before the replication-crisis awakening of the early 2010s (John, Loewenstein, & Prelec, 2012).
Third, the degree to which the criteria the algorithm uses to determine the relevance of different input variables can be explained is important. As algorithms utilizing artificial intelligence learning techniques like unsupervised learning are employed repeatedly, they become more attuned to contextual information. They evolve from using relatively few, clearly explainable variables to a larger, more diffuse set of variables whose connection to the outcome is unclear. As Gillespie (2014) points out, “‘relevant’ is a fluid and loaded judgment, as open to interpretation as some of the evaluative terms that media scholars have already unpacked, such as ‘newsworthy’ or ‘popular.’ As there is no independent metric for what are the most relevant search results for any given query, engineers must decide what results look ‘right’ and tweak their algorithm to attain that result, or make changes based on evidence from their user” (p. 175). We can call this intuitive, unexplainable subjectivity, and it is detrimental to algorithmic transparency because it undermines understanding and trust.
The fourth issue relates to Gillespie’s (2014) conception of “entanglement with practice,” which consists of two separate ideas. One is that, in line with the omnibus transparency definition we adopted in Chapter 1, the objectives for using the algorithm are rooted in corporate strategy and are often quite specific. Why did Target choose to predict which of its customers were pregnant and when they would give birth, rather than some other event commonly targeted by retailers, such as when they would purchase a house or retire? It was because Target’s marketing strategy privileged new parents; the company’s senior executives were interested in pursuing this particular customer segment because they saw it as lucrative (Duhigg, 2012). Another retailer might have chosen different segments, events, triggers, and algorithmic inputs and outputs. Simply put, when the reasons for an important decision terminate at executive preference, formulating a coherent explanation has a natural, and often inadequate, backstop that ends with a manager’s or group’s judgment.
The second idea is that the results of using algorithms are accretive; when one project is successful, it reinforces and pushes the company in the direction of building on its success and further fine-tuning the algorithms that work. This results in more sophisticated, data-intensive, and potentially complex algorithms over time and in the adoption of a more nuanced and hidden organizational decision calculus. An initial rupture eventually grows into a chasm between what the modeler envisioned and how the outputs are used in practice. As this discussion should make apparent, algorithmic transparency issues are full of rich theoretical considerations that have only recently begun to be unpacked and understood.
Enhancing Algorithmic Transparency
To deal with some of the potential challenges to algorithmic transparency identified thus far, Green and Viljoen (2020) propose an intriguing solution: ADM system users should painstakingly identify the boundaries within which their tools can work effectively and stay within them. Specifically, they suggest that as artificial intelligence applications expand their scope, they are bound to be applied to problems and settings where they are unsuitable or where their net effects (after accounting for unintended consequences) are negative. They ascribe the lack of transparency to the dominant mode of problem-solving within computer science, which they call “algorithmic formalism” and which privileges objectivity, neutrality, internalism, and universalism. To increase transparency and mitigate the unintended negative consequences of ADM systems, Green and Viljoen (2020) propose replacing algorithmic formalism with “algorithmic realism,” marked by “a reflexive political consciousness, a porousness that recognizes the complexity and fluidity of the social world, and contextualism” (p. 20). This point is pursued further in Chapter 7.
The Right to an Explanation
Intriguingly, many legal experts argue that when algorithms are used primarily or solely in making decisions that materially impact consumers, those affected adversely have a “right to an explanation” of the reasoning that led to the unfavorable outcome. The European General Data Protection Regulation (GDPR), a comprehensive regulation covering the collection, storage, and use of consumers’ personal information adopted by the European Parliament in 2016, further intensified the debate about the importance of the right to an explanation, with experts seeing it as “one tool for scrutinizing, challenging, and restraining algorithmic decision making” (Edwards & Veale, 2017, p. 6). A heavily analyzed and debated section of the GDPR involves the individual’s right to an explanation, which is scaffolded around the contents of Articles 15 and 22.
Article 15 of the GDPR provides consumers with the right not only to obtain confirmation of whether an organization is using their personal data but also to obtain such things as “the purposes of the processing,” “the categories of personal data concerned,” and “the existence of automated decision-making, including profiling, … and meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject.”
Furthermore, Article 22 of the GDPR states that the consumer “shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her.” Companies using the consumer’s data “shall implement suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests, at least the right to obtain human intervention on the part of the controller, to express his or her point of view and to contest the decision” (Goodman & Flaxman, 2017).
As the stipulations make clear, being able to explain how the ADM system works and how it is being used to make individual decisions is key information that the business must be able to provide affected consumers (Edwards & Veale, 2018). These ideas, particularly the level of power afforded to consumers, have not yet fully carried over to the United States, but the topic of the “right to an explanation” is full of intriguing and potentially promising possibilities to provide more balance and fairness to algorithmic transparency applications, particularly in digital marketing settings.
Comprehensibility, Understandability, and Explainability
In exploring the nuances behind mechanisms to enhance algorithmic transparency, researchers use several terms that overlap and are often used interchangeably. Arrieta and colleagues (2020) provide a useful typology of these terms and explain the nuances in their meanings, which helps clarify algorithmic transparency in practice. The first term is comprehensibility, defined as the ability of a learning algorithm to represent its learned knowledge in a form that, at a minimum, an expert human user can understand and, at much higher levels of comprehensibility, lay people can understand as well. Many researchers also use comprehensibility interchangeably with interpretability, which is seen as a measure of the model’s complexity (Guidotti et al., 2018).
Second, understandability or intelligibility refers to the property of the model by which a layperson can understand its functioning, specifically how the model works, without any further need to explain its internal structure or technical details behind the algorithm or methods used to process the data. Third, explainability refers to an active characteristic of a model, denoting the affordances that allow it to take actions or implement a procedure with the primary purpose of clarifying or detailing its internal functioning. Building on this notion of explainability, Arrieta et al. (2020) define explainable AI as “one that produces details or reasons to make its functioning clear or easy to understand.” (p. 85). Each of these specific mechanisms contributes to unwrapping and clarifying the algorithm’s operation to users and targets.
Post-hoc explainability. Another way to increase the transparency of ADM systems is to use post-hoc explainability methods for models that are not readily interpretable to begin with. A number of techniques are currently in use, with improvements constantly occurring in this active research space, each relying on expository devices familiar to humans. They include text explanations, which rely on a verbal description, using appropriate mathematical symbols to describe the functioning of the model; visual explanations, which use visualization means like graphs, charts, and infographics to depict the model’s functioning; local explanations, which segment the relatively more complex solution space and provide verbal or visual explanations of simpler subspaces; explanations by example, which extract and describe selected data samples to construct vivid examples (e.g., of a prototypical sample member); explanation by simplification, which relies on constructing simplified, cruder, and more tractable models that convey the gist of the original model to the observer; and feature relevance explanations, which describe the inner functioning of a model by computing relevance scores for key input variables, thus allowing observers to understand the relative importance of the different inputs. Note that within a particular application context, multiple post-hoc explainability methods may be used concurrently to improve interested observers’ understanding of the model.
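As a hedged illustration of the last of these, feature relevance explanation, the sketch below computes permutation-style relevance scores for a toy, opaque screening model: each input is shuffled in turn, and the resulting drop in accuracy indicates how much the model depends on it. The model, the synthetic data, and the feature names are all hypothetical; in practice, analysts typically rely on established tools such as LIME or SHAP rather than hand-rolled code.

```python
# Hypothetical sketch of a feature relevance explanation via permutation:
# shuffle one input at a time and measure how much the model's accuracy drops.
# The "black box" model, data, and feature names are invented for illustration.

import random
random.seed(0)

FEATURES = ["credit_score", "income_to_rent", "prior_evictions"]

def black_box(row):
    """Stand-in for an opaque ADM model that outputs accept (1) / reject (0)."""
    s = 0.004 * row["credit_score"] + 0.4 * row["income_to_rent"] - 0.8 * row["prior_evictions"]
    return 1 if s > 3.5 else 0

# Synthetic evaluation data; labels here simply agree with the model's decisions.
data = []
for _ in range(500):
    row = {"credit_score": random.randint(500, 850),
           "income_to_rent": random.uniform(1.0, 5.0),
           "prior_evictions": random.randint(0, 2)}
    data.append((row, black_box(row)))

def accuracy(rows):
    return sum(black_box(r) == y for r, y in rows) / len(rows)

baseline = accuracy(data)

# Permute each feature across the dataset and record the accuracy drop.
for feature in FEATURES:
    shuffled_values = [r[feature] for r, _ in data]
    random.shuffle(shuffled_values)
    permuted = [({**r, feature: v}, y) for (r, y), v in zip(data, shuffled_values)]
    drop = baseline - accuracy(permuted)
    print(f"{feature}: relevance = {drop:.3f}")
```

The printed relevance scores give observers a rough ordering of which inputs drive the opaque model’s decisions, which is often enough to start a conversation about whether those inputs are appropriate.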
The benefits of explainable ADM systems. Explainable ADM systems have at least six significant benefits that support their use and make them more accountable to users and those affected by their outputs (Arrieta et al., 2020; Guidotti et al., 2018). First, explainable models engender user trust because they bolster users’ beliefs that the model will run as intended for a given problem. Second, they help to establish causality by making it easier to find relationships between variables for further tests to investigate causal linkages. Third, explainable models make it easier to establish the boundaries within which a model may be used, leading to more effective implementations. For example, a credit-scoring model that heavily weighs several years of historical financial behavior may be unsuitable for scoring a teenager or a new immigrant with insufficient or sparse prior information. Relatedly, porting an explainable model to other similarly structured problems is easier (Habermalz & Huysman, 2021).
Fourth, explainable models support extrapolation by providing more information to the decision-maker about how to use the model output to make effective decisions. Simply put, these models are concurrently more informative and more restrictive. Fifth, by giving visibility into the structural elements that make the model robust and stable, explainable models lend decision-makers greater confidence to interpret and use the model’s output and ease the burden on non-technical or non-expert users when interacting with them. Finally, somewhat optimistically, Arrieta and colleagues (2020) argue that explainability can be considered the capacity to reach and even guarantee fairness in ADM systems: users can get involved in improving the models to the extent that they can make sense of the machinery inside.
Customer Transparency
The discussion so far in this chapter has focused on the use of algorithms by businesses and how they can manage algorithmic transparency to maintain their relationships with those affected by the algorithms’ outputs. However, the consideration of algorithmic transparency also raises a host of issues that are squarely focused on the discloser’s (i.e., the target’s) side of the information exchange, most of which fall within the domain of consumer behavior because they involve customers.
A particularly significant form of relevant transparency with multifaceted and nuanced impacts is customer disclosure, defined here as the personal information customers choose to reveal, are required to reveal, or inadvertently reveal about themselves to the business, the exchange partners on a platform, other customers, or (known and unknown) third parties throughout their buying and consumption journeys. Customer disclosure is a double-edged sword. On the one hand, the more the marketer knows about the customer, the better they will be able to serve them, for example, by employing more nuanced and explainable algorithms. This so-called principle of customer orientation, which is to understand the customer and then design and deliver customized, compelling offerings to suit their preferences and tastes, lies at the heart of marketing strategy (Day, 1999) and has been robustly shown to affect customer well-being using a slew of measures (Arora et al., 2008). Indeed, with the advent of digital technology, many ethical marketers have embraced the concept of personalizing offerings for individual customers to increase the delivered value (Wind & Rangaswamy, 2001). However, there’s a significant downside for customers as well; revealing personal information creates the potential for them to be exploited and discriminated against by the marketer, other customers, and third parties that access the customer’s information.
Nowhere is the potential harm that disclosure can cause customers more clearly evident than in the seminal study of orchestra audition policies (Goldin & Rouse, 2000). When orchestras switched during the 1970s and 1980s from open auditions, where hiring juries could see the candidate, to blind auditions, where musicians were hidden behind screens during the audition, the percentage of female musicians hired by orchestras increased substantially, with as much as 55% of the increase attributed to the reduced transparency during the hiring process. Simply stated, anonymity reduced gender discrimination in hiring musicians. Disclosing customer information is analogous to “seeing” musicians in this study; the more the personal information is disclosed, the more visible the customer becomes, and thus, the more susceptible they will be to overt or inadvertent biases.
Other influential research has found that job candidates with Black-sounding names are significantly less likely to receive callbacks for interviews than candidates with White-sounding names across occupations and industries (Bertrand & Mullainathan, 2004). These are robust findings that have been replicated repeatedly and in different settings. In these and many other such examples, disclosure of personal information is a precursor to discrimination and exploitation. Thus, there is a clear economic danger to customers from revealing personal information, one that can counterbalance or even outweigh any potential benefits of personalization. The tension between the benefits of transparency and the potential harm it can cause customers lies at the heart of studying customer disclosure. Marketing scholars refer to this as the privacy-personalization paradox (Aguirre et al., 2016).
Customer disclosure can encompass a wide range of variables and personal information: observed behaviors such as online browsing, viewing, and engagement, purchases, reviews, recommendations, complaints, and cancellations; expressed variables provided through survey responses, such as investment goals and risk tolerance, expectations, evaluations of performance, and psychological traits; and inferred data, such as preferences, demographic and psychographic descriptors, and behavioral tendencies attributed to the customer. We can also distinguish between active disclosure, where the customer knows they are sharing information about themselves, and passive disclosure, where the sharing happens incidentally without the customer’s awareness. Complicating this distinction is the fact that even when customers sign up with service providers and agree to their terms of service, which typically provide details about what personal information will be collected and how it will be used, many of them fail to read or comprehend the dense description (Kim, 2017).
Accordingly, the customer transparency concept is nuanced on multiple levels. In a particular setting, the structure of customer transparency can be regulated, negotiated, or designed; it can be a resource or a feature; it can be interpreted from the standpoint of the seller, the platform, other customers, or the customer; it can be deliberate, inadvertent, or completely unknown; and it can be a concession, attribute, activity, or antecedent in the analysis of a particular problem or issue. For instance, fair lending regulations require that US financial institutions make lending decisions without consideration of the applicant’s race, gender, sexual orientation, or religion (among other defining characteristics). In such settings, observed or inferred data about customers along any of these defining characteristics cannot be used to score or assess them, even if it is available to the business.
In contrast, on two-sided platforms such as eBay and Airbnb, consumers have a fair degree of autonomy regarding how much of their personal information to share; the decision involves a trade-off between the benefits of disclosure, such as crafting a curated identity to impress transaction partners and generate desired outcomes, and its costs such as enduring racial or other forms of discrimination. Platforms can also embed designed disclosure constraints into their structure, dictating how much information customers can disclose and when. For instance, many platforms require their customers to identify themselves only by their first names and allow pseudonyms to give them anonymity.
While the affordance of anonymity is useful, customers’ understanding of how much personal information they are disclosing and how sellers are using this information is, much like other forms of knowledge, often lacking or inaccurate. In other words, the same issues and dilemmas we saw in algorithmic transparency crop up for customer disclosure. While customers may deliberately and thoughtfully disclose certain information during exchanges, a considerable amount of disclosure may be inadvertent. Given the complex nature of customer transparency, its implications play out in numerous consequential ways in business exchanges: a greater likelihood of discrimination resulting from sharing one’s racial, sexual, or gender identity on platforms; a greater likelihood of getting funded by telling compelling stories or even just “appearing trustworthy” (Duarte, Siegel, & Young, 2012) on crowdfunding and peer-to-peer sites; and the counterintuitive beneficial effects of participating in shared medical appointments, where customer transparency extends to peer disclosure. Next, we consider each of these research areas in further detail.
Racial Discrimination on Two-Sided Platforms
On Airbnb, short-term renters and hosts can provide personal information such as their name, a photograph, and a brief introduction. Although such transparency can be beneficial by humanizing a renter and reassuring potential hosts who may be nervous about letting strangers stay in their homes, it also has downsides. As one concrete example, research shows that disclosing one’s racial identity on the Airbnb platform, inferred by the host through the renter’s name, can lead to racial discrimination (Cui, Li, & Zhang, 2020; Edelman, Luca, & Svirsky, 2017). In one influential field experiment in which booking requests were sent to approximately 6,400 Airbnb rental listings, Edelman, Luca, and Svirsky (2017) found that compared to booking requests from potential renters with distinctively white names like Anne Walsh and Brad Murphy, those from renters with distinctively Black names like Tanisha Jackson and Tyrone Robinson were 16% less likely to be accepted (a 42% positive response versus 50%), even though every act of discrimination potentially cost the host between $65 and $100 in foregone revenue.
Furthermore, racial discrimination was widespread in the study; as the authors put it, “Our results are remarkably persistent. Both African American and white hosts discriminate against African American guests; both male and female hosts discriminate; both male and female African American guests are discriminated against. Effects persist both for hosts who offer an entire property and for hosts who share the property with guests. Discrimination persists among experienced hosts, including those with multiple properties and those with many reviews. Discrimination persists and is of similar magnitude in high- and low-priced units, in diverse and homogeneous neighborhoods.” (p. 3).
Other studies have found that Airbnb hosts are also discriminated against, as measured by the average prices realized by minority hosts relative to white hosts (Kakar et al., 2018); in a large multinational study, women hosts earned an average of 12% less revenue and Black hosts earned an average of 22% less revenue (Törnberg, 2022). In other words, discouraging as it sounds, these studies show that when there is disclosure, it prompts discrimination by virtually everyone using the disclosed information for decision making.
Encouragingly, a recent study by Cui, Li, and Zhang (2020) found that discrimination against potential renters is reduced when a positive review is posted on their Airbnb profile page. In their study, acceptance rates for white and Black renters with at least one positive review were equivalent. Even negative reviews or ratings without evaluative content helped reduce the race-based acceptance difference. The implications of these findings for customer transparency in platform design are clear: managers may be best served by designing the platform for less participant or customer transparency overall and for more controlled and purposive transparency that achieves well-defined goals while still allowing participants the discretion to choose whom to do business with (Fisman & Luca, 2016). The transparency principle articulated at the outset of this book needs to be dialed down.
There are many ways to moderate customer disclosure and encourage efficient and collegial transactions between renter and host. They include concealing renter names and photographs until after the host confirms the booking, reducing the size of host photographs, and giving greater prominence to hosts’ experience and to factual information about the rental. Additionally, renters and hosts could be encouraged to use pseudonyms instead of their real names, renters could be allowed to provide personal information about themselves beyond their appearance, such as their hobbies and travel interests, to reduce negative stereotyping, and members could be nudged with reminders that they are subject to implicit biases. Hosts could be incentivized (or even mandated, if there are complaints against them) to use the “Instant Book” feature whereby rental requests from any qualified renter are accepted automatically, and hosts and renters could be encouraged to post reviews after the stay. Note that in each of these examples, transparency is implemented in a controlled and purposive way to reduce discrimination, and the idea that “more transparency is better” is jettisoned.
The Benefits of Shared Medical Appointments
In the medical field, the concept of shared medical appointments, in which “patients receive one-on-one physician consultations in the presence of others with similar conditions” (Ramdas & Darzi, 2017), provides another intriguing setting to study a different form of customer transparency. Shared medical appointments have been used for various purposes, including patient education, physical exams, and treating chronic conditions requiring multiple visits. They are seen as a way to deliver medical care efficiently using available physician resources. Instead of affording complete privacy from a transparency perspective, the method employs designed co-interaction, resulting in the inadvertent disclosure of one’s medical information to peers.
Recent research suggests that greater patient transparency in shared medical appointments has unexpected benefits. Beyond the obvious efficiency gains from the physician providing the core, common information once to the group instead of repeating it to each patient, patients also feel more engaged in this setting, spending more time with the physician than they would in a one-to-one setting, sharing camaraderie and solidarity with peers, and learning from the physician’s interactions with them. Buell, Ramdas, and Sonmez (2021) conducted a randomized controlled trial with one thousand patients undergoing glaucoma treatment over three years at an Indian eye hospital and found that patients who met with the physician in a group setting demonstrated greater engagement on a variety of behavioral measures, such as questions asked and answered, comments made, attentiveness, and head wobbling, than those who participated in one-on-one appointments. Clearly, the patients’ loss of privacy was more than counterbalanced by the learning and camaraderie in this setting. An obvious open question is whether these beneficial effects of greater customer transparency are culture-specific (e.g., Indians may be socially and culturally accustomed to less privacy than people in other cultures; Larson & Medora, 1992) and whether they carry over to, say, the United States, where patient privacy is treated as sacrosanct in medical practice.
Effects of Customer Transparency on Crowdfunding and Peer-to-Peer Lending
Financially oriented platforms where peers undertake economic exchanges offer powerful insights into the nuanced roles and double-edged nature of customer transparency. The purpose and parameters of the economic exchange vary depending on the platform. On crowdfunding platforms, the economic exchange relationship is that of creators and patrons: creators seek funding for commercial, social, or creative projects, and patrons may receive something tangible in exchange, such as a product, or simply donate money to support the creator. Peer-to-peer (P2P) lending platforms, in contrast, are alternatives to traditional lending and investment vehicles: individual borrowers seek funding from individual investors or lenders. By disintermediating financial institutions, exchange partners can receive better terms, namely a lower interest rate for the borrower and a higher risk-adjusted return for the investor.
On both crowdfunding and P2P lending platforms, there is a significant tension between maintaining arm’s-length relationships and recipient anonymity, on the one hand, and providing sufficient relevant information about the recipient for givers to feel comfortable making financial contributions, on the other. Some disclosure by the receiver is essential, but too much may cause significant harm. Furthermore, the validity of the disclosure itself adds a layer of uncertainty and complexity to the giver’s decision making.
Research has found that when previously unacquainted consumers engage in economic exchanges such as borrowing and lending money, they need a way to engender trust; self-disclosure, for instance by providing a coherent and seemingly authentic narrative about oneself, is one way to do so (Sonenshein, Herzenstein, & Dholakia, 2011). On Prosper.com, for example, a natural thing for most borrowers to do in their loan listings was to explain why they needed the money, along with making claims about their identities. Herzenstein, Sonenshein, and Dholakia (2011) studied the effects of such disclosure on borrowers’ ability to receive funding and to repay the borrowed money two years later.
They uncovered six different identity claims in borrowers’ self-disclosures concerning their economic exchange relationship: trustworthy, successful, hardworking, economic hardship, moral, and religious. For instance, a trustworthy identity claim meant that the borrower’s narrative centered on conveying that lenders could trust the borrower to pay back the money on time, whereas a successful identity claim forwarded the narrative that the borrower is someone with a successful business, job, or career. The authors found that both the number and the content of identity claims disclosed in borrowers’ narratives on Prosper.com affected lenders’ decisions. When borrowers disclosed more information about themselves by including more identity claims in their narrative, loan funding, defined as the percentage of the requested amount given by lenders, went up. The content of the disclosure also mattered: borrowers who constructed narratives around trustworthy or successful identities received more loan funding than those who composed and conveyed other identities, such as being moral or suffering economic hardship. Interestingly, these borrowers were subsequently less likely to repay their loans, suggesting that, unlike strategic herding, borrowers’ narratives were inauthentic and constructed opportunistically, adversely influencing lenders.
Summary
These three examples of consumer disclosure settings underscore the key point surfacing throughout this chapter: transparency involving the acquisition of private consumer information, on the one hand, and its use in algorithms, on the other, is rarely amenable to simplification, idealization, or generalization. Rarely is it the case in applications of algorithmic transparency and consumer disclosure that transparency can be defined clearly, that provisions can be made to fully manage disclosure or its results, or that entirely meaningful explanations and understanding can be generated. The key task, and accordingly the highest-value scholarly contribution now and in the future, may lie mainly in identifying, explaining, and negotiating the boundaries of reasonable disclosure and in finding unique, deliberately formulated, and mutually considered solutions that are highly contextualized and resistant to extrapolation or generalization.