The Importance of User Experience in AI Systems and How User Experience Was Key to ChatGPT’s Success
As artificial intelligence (AI) becomes more prevalent in our daily lives, it is important to define the place of user experience in AI system development. Whether it’s a virtual assistant on our phone or a chatbot on a website, we want AI systems that understand and respond to us naturally and effectively.
In fact, chatbots and virtual assistants are simple examples of a basic premise: user experience is a prerequisite for developing high-quality software, and this also applies to AI systems.
According to the ISO/IEC 25000 series on systems and software quality requirements and evaluation (SQuaRE), “satisfaction” is a characteristic of quality in use. In other words, end-user satisfaction should always be taken into consideration so that developed systems are adopted and used correctly. Satisfaction can be broken down into the following sub-characteristics:
· Usefulness: Are the end-users satisfied with the perceived achievement of their goals?
· Comfort: Is the system meeting the end-users’ needs?
· Trust: Are the end-users or stakeholders confident in the capabilities of the system?
· User experience: Are the end-users having a good experience with the system?
I genuinely believe that most AI development teams are primarily working on building very performant AI systems. However, this quest for high performance might come at a high cost if done at the expense of end-user satisfaction.
GPT-3 vs. ChatGPT
GPT-3 (Generative Pre-trained Transformer 3) is a language model developed by OpenAI in 2020 that has been widely used for various natural language processing tasks. It is one of the largest and most powerful language models currently available, with 175 billion parameters.
Although this system is very powerful and usually generates impressive and realistic content, it was simply not designed with user experience in mind. In fact, user experience is very… sub To be specific, GPT-3 was developed not to enhance user experience but as a general-purpose language model for various applications. Its main goal was to improve performance on natural language processing tasks such as translation, summarization, and question answering.
ChatGPT, on the other hand, like InstructGPT, is a variant of GPT-3 that was developed to enhance user experience. How is ChatGPT able to incorporate user experience considerations? By interacting with end-users, which allows it to adapt and improve its performance over time. The image below represents ChatGPT’s high-level development methodology:
Incorporating user data into the training process allows ChatGPT to become more personalized and tailored to individual users’ specific needs and preferences. This leads to a better overall user experience, as the model is better able to understand and respond to the user in a natural and effective way.
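One way to picture how human feedback can enter training is the pairwise ranking loss used for the reward model in Ouyang et al. (listed in the sources): people compare two responses to the same prompt, and the reward model is pushed to score the preferred response higher. The sketch below is only an illustration in plain Python. The function names and the example scores are hypothetical, and this is not OpenAI's implementation.

```python
import math
from typing import Iterable, Tuple


def pairwise_ranking_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_preferred - reward_rejected)).

    Written in a numerically stable form so that large score gaps
    do not overflow math.exp.
    """
    margin = reward_preferred - reward_rejected
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def mean_ranking_loss(pairs: Iterable[Tuple[float, float]]) -> float:
    """Average the loss over (preferred_score, rejected_score) pairs."""
    pairs = list(pairs)
    return sum(pairwise_ranking_loss(w, l) for w, l in pairs) / len(pairs)


# Hypothetical reward-model scores for two human comparisons.
# In each pair, the first score belongs to the response people preferred.
comparisons = [(1.8, 0.3), (0.2, 0.9)]
print(mean_ranking_loss(comparisons))
```

The loss is small when the preferred response already scores higher and grows when the ordering is wrong, which is what nudges the reward model toward human preferences.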
In summary, while GPT-3 was not developed to enhance user satisfaction, InstructGPT and ChatGPT were explicitly designed with this objective in mind. ChatGPT’s ability to learn from users through the incorporation of user data allows it to improve and adapt over time, resulting in a more personalized and effective language model.
Why does user satisfaction matter?
The benefits of this approach are clear. First, a more personalized AI system can better understand and respond to the user, leading to a better overall experience. This is particularly important in applications such as customer service chatbots, where understanding and addressing the user’s specific needs is crucial.
In addition to improving the user experience, ChatGPT’s incorporation of user data also leads to a more effective system overall. By constantly learning from new data, ChatGPT is able to generate more accurate and relevant responses, leading to a higher level of satisfaction for the user.
How end-users interact with the AI system
The future is bright for AI systems as long as they are developed with user experience in mind. This means that a deep understanding of the AI-human relationship will be required to develop high-quality AI systems that will be adopted. Harvard Business Review has published a very useful article describing different collaboration modes between end-users and AI systems.
AI system assisting end-users.
There are processes that will remain led by humans in the future (despite what science fiction fans might tell you!). ChatGPT is an excellent example, as it creates content for humans which should be reviewed by end-users. Here are the capabilities that are key to developing such a tool successfully:
· Amplify: Assist end-users across a wide variety of tasks to help them go further in their role. ChatGPT is very generalizable and can generate all sorts of content (outlines, texts, even code) to help end-users in various tasks. An automated AI solution would instead focus on a more performant but narrower output.
· Interact: Allow end-users to interact further with the system until they obtain the output they are most comfortable with. This is essentially why ChatGPT has a dialogue capability. It gives end-users more control over the output.
· Embody: Generate content that is customized for each end-user. Again, this is why ChatGPT and InstructGPT have started updating according to user inputs. Although this is not a fully personalized mode yet, it is getting there.
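The Interact capability can be pictured as a conversation loop that keeps the full history, so each follow-up request refines the previous output instead of starting over. The sketch below assumes nothing about how ChatGPT is actually built: `Conversation` and `toy_model` are hypothetical names, and `toy_model` is a stand-in for a real language model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text), where role is "user" or "assistant"


@dataclass
class Conversation:
    """Keeps the dialogue history and passes all of it to the model on every turn."""
    model: Callable[[List[Message]], str]
    history: List[Message] = field(default_factory=list)

    def ask(self, user_text: str) -> str:
        self.history.append(("user", user_text))
        reply = self.model(list(self.history))  # the model sees every earlier turn
        self.history.append(("assistant", reply))
        return reply


def toy_model(history: List[Message]) -> str:
    """Hypothetical stand-in for a language model: echoes all user requests so far."""
    user_turns = [text for role, text in history if role == "user"]
    return f"Draft v{len(user_turns)} covering: " + "; ".join(user_turns)


chat = Conversation(model=toy_model)
first = chat.ask("Write an outline about UX in AI")
second = chat.ask("Make it shorter")
print(first)
print(second)
```

Because the second request is answered with the first one still in the history, the end-user can steer the output step by step, which is the kind of control the Interact bullet describes.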
End-user assisting automated AI systems.
Other types of AI solutions are built to be automated. For example, recommender engines on e-commerce websites, anomaly detection, automated driving and other types of problems require a higher level of automation. However, ChatGPT is not such a solution; automated systems require different capabilities to optimize their end-user experience:
· Training: Allow and facilitate re-training of the AI solution to maximize the system’s performance. Such a system should have interfaces for selecting and removing training data, evaluating models and selecting models.
· Explaining: For any automated system, explainability is vital as it allows end-users to troubleshoot problems and understand how decisions were made. It is required in some regulated industries and critical use cases. With ChatGPT, this is not feasible. Even when you ask for references or reasons regarding its generated content, ChatGPT does not provide actual interpretability logs or reasons.
· Sustaining: An automated AI system should have mechanisms to monitor and correct problems for future iterations. Again, ChatGPT does not have such features.
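As a rough illustration of the Sustaining capability, the sketch below tracks recent end-user feedback in a rolling window and flags the system for review when too much of it is negative. The class name, the window size and the threshold are all hypothetical choices made for this example, not features of any particular product.

```python
from collections import deque


class FeedbackMonitor:
    """Rolling monitor over the most recent thumbs-up / thumbs-down ratings."""

    def __init__(self, window: int = 100, max_negative_rate: float = 0.2):
        self.ratings = deque(maxlen=window)  # oldest ratings drop out automatically
        self.max_negative_rate = max_negative_rate

    def record(self, is_positive: bool) -> None:
        self.ratings.append(is_positive)

    def negative_rate(self) -> float:
        if not self.ratings:
            return 0.0
        return self.ratings.count(False) / len(self.ratings)

    def needs_review(self) -> bool:
        """True when the share of negative ratings exceeds the threshold."""
        return self.negative_rate() > self.max_negative_rate


monitor = FeedbackMonitor(window=5, max_negative_rate=0.4)
for rating in [True, True, True, False, False]:
    monitor.record(rating)
print(monitor.needs_review())
```

A flag like this is only the monitoring half. Correcting the problem would still go through the Training capability, for example by reviewing the flagged interactions and re-training.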
To design the user experience properly, it is very important to understand the role the AI system will play. For example, MIT Technology Review documented multiple examples of GPT-3 being used in an automated fashion, which generated bad, very bad and sometimes even catastrophic or illegal responses.
In conclusion
We have seen many AI algorithms, models and systems developed by numerous companies. Even some of the top systems like GPT-3 are not fully adopted, not because of performance reasons but because of a lack of consideration for user experience.
ChatGPT is very popular because OpenAI has decided to put user experience first. Moreover, OpenAI has made it clear that ChatGPT is meant to assist end-users in their content creation tasks. This is probably why it was so widely adopted in such a short amount of time.
I hope that ChatGPT will lead a new era of user-centric AI systems, and I firmly believe that user experience is a prerequisite to deploying successful AI systems.
Sources
- Ouyang, Long, et al. “Training language models to follow instructions with human feedback”. arXiv, Mar 4th, 2022. arXiv.org, http://arxiv.org/abs/2203.02155.
- Brown, Tom B., et al. “Language Models are Few-Shot Learners”. arXiv, Jul 22nd, 2020. arXiv.org, https://doi.org/10.48550/arXiv.2005.14165.
- “ChatGPT: Optimizing Language Models for Dialogue”. OpenAI, Nov 30th, 2022, https://openai.com/blog/chatgpt/.
- “Why GPT-3 Is the Best and Worst of AI Right Now”. MIT Technology Review, Feb 24th, 2021, https://www.technologyreview.com/2021/02/24/1017797/gpt3-best-worst-ai-openai-natural-language/.
- “Collaborative Intelligence: Humans and AI Are Joining Forces”. Harvard Business Review, Jul 1st, 2018. hbr.org, https://hbr.org/2018/07/collaborative-intelligence-humans-and-ai-are-joining-forces.
- ISO/IEC 25022:2016 — Systems and software engineering — Systems and software quality requirements and evaluation (SQuaRE) — Measurement of quality in use. https://webstore.ansi.org/standards/iso/isoiec250222016.