WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization
Abstract: LLMs support data analysis through conversational user interfaces, as exemplified in OpenAI’s ChatGPT (formerly known as Advanced Data Analysis or Code Interpreter). Essentially, LLMs produce code for accomplishing diverse analysis tasks. However, presenting raw code can obscure the logic and hinder user verification. To empower users with enhanced comprehension and augmented control over analysis conducted by LLMs, we propose a novel approach to transform LLM-generated code into an interactive visual representation. In the approach, users are provided with a clear, step-by-step visualization of the LLM-generated code in real time, allowing them to understand, verify, and modify individual data operations in the analysis. Our design decisions are informed by a formative study (N=8) probing into user practice and challenges. We further developed a prototype named WaitGPT and conducted a user study (N=12) to evaluate its usability and effectiveness. The findings from the user study reveal that WaitGPT facilitates monitoring and steering of data analysis performed by LLMs, enabling participants to enhance error detection and increase their overall confidence in the results.
Keywords: Animation, Conversational Data Analysis, Human-AI Interaction, Generative AI (GenAI), Code Verification
Motivation
If you haven’t tried data analysis with ChatGPT or other conversational tools like Gemini, go ahead and give it a try! You will be impressed by how helpful they are for simple tasks 🥰. However, we observed some issues with the current design. According to our interview study (N=8), waiting for the response can be daunting and leaves users idle. Besides, verifying raw code is mentally taxing. And once you spot an error, fixing it means regenerating the entire conversation thread, which is inefficient. We are thus motivated to explore an alternative paradigm for interacting with these LLM-powered conversational tools, one that gives users more agency in the process.
“Wait…GPT!”
The name of our system, “WaitGPT”, is a play on the phrase “Wait…what?” and the GPT model. As the name suggests, we would like to make users more proactive when interacting with a data analysis assistant. WaitGPT translates the streaming code generated by LLMs into visual primitives and provides users with a clear, step-by-step illustration. In addition, the diagram is interactive. Besides examining intermediate results, users can interact at the level of individual data operations to tweak the code and later re-execute it for a more refined analysis.
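To make the idea of turning code into step-by-step primitives more concrete, here is a minimal sketch, not WaitGPT's actual implementation. It assumes the generated code is pandas-style Python and uses the standard `ast` module to list method calls in roughly the order they would run. The function names `root_name` and `list_operations`, the sample snippet, and the file name `employees.csv` are all hypothetical.

```python
import ast


def root_name(node: ast.AST) -> str:
    """Follow a chain such as `clean.sort_values(...).head` back to its first name."""
    while isinstance(node, (ast.Attribute, ast.Call, ast.Subscript)):
        node = node.func if isinstance(node, ast.Call) else node.value
    return node.id if isinstance(node, ast.Name) else "<expr>"


def list_operations(code: str) -> list:
    """Return (variable, method) pairs for every method call in `code`."""
    found = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            # A call that ends earlier in the source is evaluated earlier
            # within a chain, so the end position serves as the sort key.
            key = (node.end_lineno, node.end_col_offset)
            found.append((key, root_name(node.func.value), node.func.attr))
    return [(name, method) for _, name, method in sorted(found)]


# Hypothetical LLM-generated snippet (it is only parsed here, never executed).
generated = '''
df = pd.read_csv("employees.csv")
clean = df.dropna()
top = clean.sort_values("salary").head(5)
'''

operations = list_operations(generated)
for name, method in operations:
    print(f"{name} -> {method}")
```

Each resulting pair could back one node in a diagram. Handling code that is still streaming in (and so may not parse yet) would need extra care that this sketch leaves out.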
Case: Sorting Error
This video demonstrates a case where a user spots an issue and uses WaitGPT to correct it.
Here, the sorting fails. By inspecting the intermediate data table variable, one can see that the failure is caused by the string-based data type of the salary column. With WaitGPT, users can request an update to the generated code using natural language. Once satisfied with the local update, they can re-execute the code snippet and let the LLM agent continue the analysis based on the corrected execution result.
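The kind of failure described above can be reproduced in a few lines. This is an illustrative sketch in plain Python, not the code from the video. The employee names and salary values are made up, and the salaries are stored as strings to mimic a string-typed column.

```python
# Hypothetical rows; the salary values are illustrative, not from the study dataset.
employees = [
    {"name": "Ana", "salary": "9500"},
    {"name": "Ben", "salary": "120000"},
    {"name": "Cy", "salary": "48000"},
]

# Sorting on the raw strings compares characters left to right,
# so "120000" comes before "48000", which comes before "9500".
by_string = sorted(employees, key=lambda row: row["salary"])

# Converting each salary to an integer first gives the intended numeric order.
by_number = sorted(employees, key=lambda row: int(row["salary"]))

string_order = [row["name"] for row in by_string]
number_order = [row["name"] for row in by_number]
print(string_order)  # ['Ben', 'Cy', 'Ana']
print(number_order)  # ['Ana', 'Cy', 'Ben']
```

The string-sorted result looks plausible at a glance, which is why inspecting the intermediate table and its column type helps reveal the problem.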
Appendix
- Implementation Details: LLM prompts
- Comparative User Study: WaitGPT vs. WaitGPT without diagram
  - Task A: Screenshot | Employee Dataset | Prompt source from emailing authors of Text2Analysis
  - Task B: Screenshot | Flight Dataset | Prompt source from the Arcade Dataset
Related Works
- Improving Steering and Verification in AI-Assisted Data Analysis with Interactive Task Decomposition. UIST 2024.
- How Do Analysts Understand and Verify AI-Assisted Data Analyses? CHI 2024.
- Conversational Challenges in AI-Powered Data Science: Obstacles, Needs, and Design Opportunities. arXiv 2023.
Citation
Liwenhan Xie, Chengbo Zheng, Haijun Xia, Huamin Qu, and Chen Zhu-Tian. 2024. WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST’24). Article No. 119. 14 pages. ACM, New York, NY, USA. DOI: 10.1145/3654777.3676374
Bibtex
@InProceedings {xie2024waitgpt,
title = {WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization},
author = {Liwenhan Xie and Chengbo Zheng and Haijun Xia and Huamin Qu and Chen Zhu-Tian},
booktitle = {Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology},
year = {2024},
numpages = {14},
articleno = {119},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3654777.3676374},
doi = {10.1145/3654777.3676374},
location = {Pittsburgh, PA, USA},
series = {UIST '24}
}
🤔 People turn to LLMs for convenient intent expression and their flexibility for non-standard personal tasks. Moving beyond data analysis, future work may explore how LLMs can support more personalized, multi-modal interactions that cater to diverse user needs and contexts.
— Liwenhan Xie (@LiwenhanXie) October 14, 2024