3.1 Suggested attention points for marketing managers
While marketing managers have often used formal models of consumers’ minds to support decision-making in traditional online purchase environments, we suggest that the complexity of the AIVA interaction process requires more extensive formalization of individuals’ mental models than traditional online systems do. We propose two dimensions along which this complexity is introduced. First, following proposition 2, voice-based assistance will likely benefit from models that allow for a more abstract representation of options when matching them to individuals’ needs and wording in decision dialogs. While traditional recommendation algorithms typically represent options in terms of their (tangible) attributes, individuals in voice-based dialogs may use more abstract benefits to express their needs. Thus, models that connect attributes to abstract benefits are particularly relevant for AIVAs (Arentze et al. 2015). For example, if the AIVA is aware of a consumer’s (abstract) desire to lose weight, it may highlight the caloric attributes of food options. Similarly, AIVAs can benefit from an ability to relate to a person’s emotional state and empathize with the user. Interactive systems that can recognize and express emotions, and that can recognize, interpret, and act upon social signals, are likely to be more successful (Yalcin and DiPaola 2018). Thus, marketing managers can pay attention to AIVAs’ ability to model the consumer’s mind in terms of capturing the consumer’s (a) needs-based representation of reality, (b) dynamic relationship with reality, and (c) emotional state.
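To make this concrete, the attribute-to-benefit mapping described above can be sketched minimally as follows. All option names, attributes, and benefit labels here are hypothetical illustrations, not a production recommendation model:

```python
# Minimal sketch: matching options to a consumer's abstract benefit
# (e.g., "lose weight") via an attribute-to-benefit mapping.

# Each option is described by tangible attributes.
options = {
    "salad":  {"calories": 350, "protein_g": 12},
    "burger": {"calories": 900, "protein_g": 30},
    "soup":   {"calories": 250, "protein_g": 8},
}

# Abstract benefits are linked to attributes by a direction:
# -1 means "lower is better", +1 means "higher is better".
benefit_to_attributes = {
    "lose weight":  {"calories": -1},
    "build muscle": {"protein_g": +1},
}

def rank_options(benefit):
    """Score options against the attributes tied to an abstract benefit."""
    weights = benefit_to_attributes[benefit]
    def score(attrs):
        return sum(direction * attrs[attr] for attr, direction in weights.items())
    return sorted(options, key=lambda name: score(options[name]), reverse=True)

print(rank_options("lose weight"))  # soup ranks first: fewest calories
```

The key design point is that the consumer never mentions “calories”; the model translates the abstract benefit into attribute-level rankings on the consumer’s behalf.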
Second, based on proposition 6, we expect that due to the inherently sequential nature of voice-based interactions, models used by AIVAs will benefit from being able to encode dynamic aspects of a consumer’s representation of a decision problem. The AIVA can track how the individual’s current state relates to previous interactions. Individuals may forget certain information over time, so an AIVA should not assume that the individual still knows previously provided information. It is also beneficial to capture how the current state relates to the future. For example, individuals may have goals that they wish to achieve, and a voice-based assistant will need to monitor progress toward these (future) goals when providing (current) advice.
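A minimal sketch of such dynamic state tracking might look as follows. The turn counts, forgetting horizon, and goal fields are hypothetical assumptions chosen for illustration:

```python
# Sketch of dynamic dialog state: the assistant records what it has already
# told the user (and when), so it can re-state information the user may have
# forgotten, and it monitors progress toward a longer-term goal.

FORGET_AFTER_TURNS = 5  # assume information fades after a few turns

class DialogState:
    def __init__(self, goal_target):
        self.turn = 0
        self.told = {}            # fact -> turn when it was last mentioned
        self.goal_target = goal_target
        self.goal_progress = 0

    def mention(self, fact):
        self.told[fact] = self.turn

    def needs_repeating(self, fact):
        """True if the fact was never given or was given too long ago."""
        last = self.told.get(fact)
        return last is None or self.turn - last > FORGET_AFTER_TURNS

    def advance(self, progress=0):
        self.turn += 1
        self.goal_progress += progress

    def goal_met(self):
        return self.goal_progress >= self.goal_target

state = DialogState(goal_target=3)
state.mention("return policy")
for _ in range(6):
    state.advance(progress=1)
print(state.needs_repeating("return policy"))  # True: mentioned 6 turns ago
print(state.goal_met())                        # True: progress 6 >= target 3
```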
Thus, for AIVAs to interact with users in a naturalistic, voice-driven manner, the dialog system will likely need not only to understand a diverse set of inputs but also to respond in a similar fashion, relying on the same terminology as the user. Current natural language systems can continue user-initiated dialogs while accounting for path dependence. To be more assistive, the conversation could also help the user achieve the purpose of the interaction. In other words, output from AIVAs may benefit from accurately translating formal models into the consumer’s language.
Other recommendations for managers follow from propositions 1 and 3. In particular, we suggest that at each point in the dialog, a system can choose from a wide range of responses to help the user achieve his or her goals. A system should understand the user’s mental model, the short-term goal of the current interaction, and long-term goals and general interests. This requires access to a variety of personal data and integration into a broader system of purchase and behavior records. Depending on the user’s mental state, different continuations of the dialog will be more or less effective. A key component missing from most current AIVAs is the ability to give purpose to an extended dialog, such that the dialog moves in the desired direction. Importantly, such models need to choose the response that will most effectively move the dialog forward, such as asking for important information that is missing (“pull” dialog) or providing information that the user is missing (“push” dialog). This approach applies not only to objective knowledge but also to subjective aspects of the dialog, where a message like “This looks good!” can provide reassurance to the user.
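The choice between a “pull” move and a “push” move can be sketched as a simple policy over the dialog’s missing information. The slot names and messages below are hypothetical placeholders, not an actual AIVA implementation:

```python
# Sketch of choosing between a "pull" move (ask the user for missing
# information) and a "push" move (give the user information they lack),
# with reassurance as the fallback.

REQUIRED_FROM_USER = ["budget", "departure_date"]   # what the system needs
USEFUL_FOR_USER = ["cancellation_policy"]           # what the user may lack

def next_move(known_from_user, given_to_user):
    # Pull: request the most important missing piece of user information.
    for slot in REQUIRED_FROM_USER:
        if slot not in known_from_user:
            return ("pull", f"Could you tell me your {slot.replace('_', ' ')}?")
    # Push: supply information the user has not yet received.
    for item in USEFUL_FOR_USER:
        if item not in given_to_user:
            return ("push", f"One thing to know about the {item.replace('_', ' ')}...")
    # Otherwise, offer reassurance to keep the dialog moving forward.
    return ("reassure", "This looks good!")

print(next_move(known_from_user={"budget"}, given_to_user=set()))
# -> a "pull" move asking for the departure date
```

Even this toy policy illustrates the section’s point: giving purpose to an extended dialog means selecting, at every turn, the move that most reduces what is still missing.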
3.2 Suggested attention points for policymakers
The proposed effects of using AIVAs for consumer decisions also raise several policy-related points of attention. With respect to consumer adoption of AIVAs, the White House Task Force on Smart Disclosure has noted several potential benefits of electronic information for consumers, including the opportunity for improved decision-making when using choice engines fueled by these data. This is especially relevant for decisions where consumers have a low need for autonomy, consistent with proposition 1. Personalized choice engines may be especially valuable for such low-autonomy decisions, and their potential has been recognized by policymakers, for example in the Dodd-Frank Wall Street Reform and Consumer Protection Act of 2010.
However, since consumers may be more susceptible to seller influence (proposition 4), active regulatory requirements may become more necessary. One way to mitigate the risks that AIVAs pose for consumers is for consumer-facing firms to adopt machine-readable disclosures. For example, financial services firms can be required by the Consumer Financial Protection Bureau (CFPB) to provide consumer information in a standardized, machine-readable form. Machine-readable disclosure information could manifest through an AIVA’s tailored and personalized sorting of information for an individual consumer. Mandatory disclosure requirements cover information that companies do not have an incentive to produce or reveal; similarly, without regulation, companies may not have an incentive to provide this information in a form that AIVAs can use. However, requiring machine-readable information does not guarantee that AIVAs (or consumers) will use the information (Loewenstein et al. 2011). Disclosure requirements are only meaningful if the information is integrated with an AIVA’s recommendation system.
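To illustrate the integration point, a standardized disclosure could be published as structured data that an AIVA ingests and surfaces directly in its recommendations. The field names and values below are hypothetical; no actual CFPB schema is implied:

```python
# Sketch: ingesting a hypothetical machine-readable disclosure record so the
# assistant can surface fee information rather than leaving it buried in a
# written disclosure document.
import json

disclosure_json = """
{
  "product": "basic_checking",
  "provider": "Example Bank",
  "fees": {"monthly_usd": 12.0, "overdraft_usd": 35.0},
  "apr_percent": 0.0
}
"""

def load_disclosure(raw):
    record = json.loads(raw)
    summary = (f"{record['product']} from {record['provider']}: "
               f"${record['fees']['monthly_usd']:.2f}/month")
    return record, summary

record, summary = load_disclosure(disclosure_json)
print(summary)  # basic_checking from Example Bank: $12.00/month
```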
In addition, the risks posed to consumers are particularly acute in verbal decision environments because the choice set is likely to be more limited than in decision environments where consumers view written information (i.e., proposition 5). Regulators could monitor whether AIVA companies voluntarily choose to incorporate disclosure information in their algorithms. Where gaps exist, regulators may need to consider extending the responsibility for providing disclosure information to the AIVA parent companies themselves.
Beyond regulation, proposition 4 further suggests a role for contract law in governing AIVAs and other digital assistants. Contract law offers no significant promise of protecting consumers from assistants that are biased against them. Creating an enforceable contract requires only that an assistant accurately discloses the nature of the service that it is providing, and on whose behalf, with no limit on the length of the disclosure (which can create the familiar “click-through” disclosure that almost no one reads). Moreover, contract law offers only weak and difficult-to-access remedies for even highly abusive practices that are consistent with the terms of the disclosed contract. Consumer protection law offers somewhat more protective standards, prohibiting “unfair and deceptive practices,” and potentially more powerful statutory remedies, but practical enforcement of those standards typically requires a government consumer protection agency, such as the Federal Trade Commission or a state attorney general, to take action (Sawchak and Shelton 2017).
One promising approach to AIVAs could be to require that they function as the agent (with a fiduciary obligation) of the consumer they are assisting. As a fiduciary, an AIVA would have a legal obligation to place the interests of the consumer first, ahead of the interests of the company that provided the AIVA or the company’s contracting partners. For example, a fiduciary could not rank an insurance product higher based on the benefit that the insurance firm provides to the AIVA company. Because AIVAs can be manufactured to keep a record of the programs according to which they operate and the actions they take, those operations and actions can be audited, ideally on an automated basis, to verify that the assistant is complying with the standard (Baker and Dellaert 2018). When the AIVA is the service provider’s agent, not the agent of the consumer, consumers are at risk of exploitation when they use the assistant to make important decisions.
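The automated audit described above can be sketched as a log of ranking decisions that a compliance check scans. The log fields, the recorded reasons, and the (deliberately simplistic) audit rule are all hypothetical illustrations:

```python
# Sketch of an auditable action log: the AIVA records each ranking decision
# and whether the product's provider pays the AIVA company, so an automated
# audit can flag rankings that may conflict with a fiduciary standard.

audit_log = []

def record_ranking(product, rank, reason, provider_pays_aiva):
    audit_log.append({
        "product": product,
        "rank": rank,
        "reason": reason,
        "provider_pays_aiva": provider_pays_aiva,
    })

def audit_fiduciary(log):
    """Flag entries where a paying provider's product was ranked first."""
    return [e for e in log if e["rank"] == 1 and e["provider_pays_aiva"]]

record_ranking("insurer_a_policy", 1, "lowest premium for user profile", False)
record_ranking("insurer_b_policy", 2, "higher premium", True)
print(len(audit_fiduciary(audit_log)))  # 0: top rank involved no payment
```

A real audit would of course need a richer standard than this single rule, but the sketch shows why machine-recorded operations make automated verification feasible at all.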
As suggested by propositions 2 and 5, social interaction with AIVAs opens another domain that may require consumer protection. As in many other forms of two-way digital interactions, consumer “inputs” to AIVAs are expected to be coded as data for use by parent firms. These audio data are likely already being used in aggregate to train and/or refine decision algorithms, just as Google uses queries to refine its search engine. However, voice can reveal significant personally identifying information. If used to develop the algorithms’ personalization or individual-level targeting, it could drive inequality and discrimination via differences in the information provided across consumer segments.
Vocal speech includes linguistic information (syntax, semantics, pragmatics), prosody, personal noises (coughs), and other auditory information (pitch, tone, volume, and rate), which can reveal demographic traits about the speaker such as gender (Schuller et al. 2013). Race and/or ethnicity can be decoded from dialects, accents, and pragmatics. Word choice can also signal social or economic strata, and the use of specific slang or jargon offers further potential to refine consumer-relevant subgroup membership. Emotional signals could be used by firms to determine which products to offer during vulnerable moments.
Finally, voice data offer clues to age- or even health-related states, which can reveal decision-relevant vulnerabilities (Giddens et al. 2013).
These voice-inferred demographics may create opportunities for exploitation. Active use of voice data, as in “dark patterns,” creates decision contexts based on the firm’s preferred outcomes in ways that can act against the individual’s best interests (Mathur et al. 2019). These concerns reflect the tension between personalization and privacy common to many types of digital services. In addition, the ability to do this type of identification may not be obvious or known to the consumer. Since discrimination may occur via omission—absent options or missing information—it is more challenging for affected individuals to identify that it is taking place, and for regulatory agencies to monitor it. Overall, there is a need for policy and/or legal frameworks to address how voice interaction data are captured and used in training AIVAs. These frameworks should address fairness (equity), privacy, data collection, and transparency.