
Wordflow: Social Prompt Engineering for Large Language Models (2401.14447v1)

Published 25 Jan 2024 in cs.HC, cs.AI, cs.CL, and cs.LG

Abstract: LLMs require well-crafted prompts for effective use. Prompt engineering, the process of designing prompts, is challenging, particularly for non-experts who are less familiar with AI technologies. While researchers have proposed techniques and tools to assist LLM users in prompt design, these works primarily target AI application developers rather than non-experts. To address this research gap, we propose social prompt engineering, a novel paradigm that leverages social computing techniques to facilitate collaborative prompt design. To investigate social prompt engineering, we introduce Wordflow, an open-source and social text editor that enables everyday users to easily create, run, share, and discover LLM prompts. Additionally, by leveraging modern web technologies, Wordflow allows users to run LLMs locally and privately in their browsers. Two usage scenarios highlight how social prompt engineering and our tool can enhance laypeople's interaction with LLMs. Wordflow is publicly accessible at https://poloclub.github.io/wordflow.

"Wordflow: Social Prompt Engineering for LLMs" introduces a novel approach called social prompt engineering, aimed at making the process of designing prompts for LLMs accessible to non-experts. Prompt engineering is critical for eliciting the desired responses from LLMs, but it is often seen as complex and esoteric, necessitating a deep understanding of AI technologies. This paper addresses this problem by harnessing social computing techniques to facilitate collaborative prompt design.

Wordflow, an open-source social text editor developed as part of this research, serves as a practical tool that embodies this new paradigm. The platform is designed with the idea that laypeople should be able to easily create, test, share, and discover prompts for LLMs. Wordflow incorporates several key features that make it accessible and user-friendly:

  1. Collaborative Environment: Wordflow employs social computing to enable users to work together on prompt design. This collaborative aspect is essential for harnessing collective intelligence and democratizing the process of prompt engineering.
  2. Ease of Use: By leveraging a familiar text editing interface, Wordflow lowers the entry barrier for non-experts. Users can intuitively interact with the application without needing a deep understanding of the underlying AI technologies.
  3. Local Execution: One of Wordflow's standout features is its ability to run LLMs locally and privately within the user's web browser. This is significant because it addresses privacy concerns (the user's text never leaves their machine) and reduces reliance on remote servers, making the tool more versatile.
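To make the sharing-and-running workflow concrete, a community-shared prompt can be thought of as a small, portable record: a text template plus metadata for discovery, where "running" the prompt simply substitutes the user's selected text into the template. The following is a minimal sketch of that idea; the field names and `{text}` template syntax are illustrative assumptions, not Wordflow's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class SharedPrompt:
    """A hypothetical record for a community-shared LLM prompt."""
    title: str
    template: str  # "{text}" marks where the user's selected text is inserted
    tags: list[str] = field(default_factory=list)

    def render(self, selected_text: str) -> str:
        """Fill the template with the text the user highlighted in the editor."""
        return self.template.format(text=selected_text)

# A user discovers a shared prompt and runs it on their own text.
improve = SharedPrompt(
    title="Improve academic tone",
    template="Rewrite the following passage in a formal academic tone:\n\n{text}",
    tags=["writing", "academic"],
)
message = improve.render("LLMs are kinda hard to prompt well.")
print(message)
```

The rendered `message` is what would be sent to the LLM, whether that model runs on a remote API or locally in the browser; keeping the shared artifact as plain template-plus-metadata is what makes it easy to create, run, share, and discover.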

The paper also highlights two specific usage scenarios to demonstrate the efficacy of social prompt engineering and the Wordflow tool:

  • Scenario 1: Everyday users collaboratively designing prompts for a common task, such as drafting a business proposal. This scenario underscores how shared knowledge and diverse perspectives can lead to more effective prompts.
  • Scenario 2: Educators and students using Wordflow as an educational tool to explore and understand how LLMs interpret different prompts. This scenario illustrates the tool’s potential for enhancing learning and engagement in educational settings.

Ultimately, "Wordflow: Social Prompt Engineering for LLMs" makes a compelling case for leveraging social computing to democratize the process of prompt engineering. By making this process accessible to non-experts, it broadens the potential user base for LLMs and fosters innovative applications across various domains. Wordflow stands out not only for its technical capabilities but also for its focus on inclusivity and usability in AI interactions.

Authors (4)
  1. Zijie J. Wang (39 papers)
  2. Aishwarya Chakravarthy (4 papers)
  3. David Munechika (6 papers)
  4. Duen Horng Chau (109 papers)
Citations (8)