Priority Sampling of Large Language Models for Compilers (2402.18734v1)
Abstract: LLMs show great potential in generating and optimizing code. Widely used sampling methods such as Nucleus Sampling increase the diversity of generation but often produce repeated samples at low temperatures and incoherent samples at high temperatures. Furthermore, the temperature coefficient has to be tuned for each task, limiting its usability. We present Priority Sampling, a simple and deterministic sampling technique that produces unique samples ordered by the model's confidence. Each new sample expands the unexpanded token with the highest probability in the augmented search tree. Additionally, Priority Sampling supports generation constrained by a regular expression, providing a controllable and structured exploration process. Priority Sampling outperforms Nucleus Sampling for any number of samples, boosting the performance of the original model from a 2.87% to a 5% improvement over -Oz. Moreover, it outperforms the autotuner used to generate labels for the training of the original model within just 30 samples.
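The abstract describes a best-first expansion of a search tree keyed by token probability: each new sample starts from the most confident token not yet expanded and is completed deterministically. The Python sketch below illustrates that idea under stated assumptions; `next_token_probs`, `eos_id`, `top_k`, and `max_len` are hypothetical names introduced here, and this is not the authors' implementation (which additionally supports regex-constrained decoding).

```python
import heapq


def priority_sampling(next_token_probs, eos_id, num_samples, top_k=5, max_len=64):
    """Return up to num_samples unique completions, most confident first.

    next_token_probs(prefix) is a hypothetical stand-in for the LLM: it maps a
    list of token ids to a dict {token_id: probability} for the next token.
    """
    # Frontier of unexpanded branches: (negative branch probability, token prefix).
    frontier = [(-1.0, [])]
    samples = []
    while frontier and len(samples) < num_samples:
        neg_p, prefix = heapq.heappop(frontier)  # highest-probability unexpanded branch
        tokens = list(prefix)
        # Greedily complete this branch; alternatives skipped along the way are
        # pushed back onto the frontier so later samples can expand them.
        while len(tokens) < max_len and (not tokens or tokens[-1] != eos_id):
            probs = next_token_probs(tokens)
            ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
            best_token, _ = ranked[0]
            for tok, p in ranked[1:top_k]:  # keep a few alternatives per step
                heapq.heappush(frontier, (-p, tokens + [tok]))
            tokens.append(best_token)
        samples.append((-neg_p, tokens))
    return samples
```

Because every branch differs from all earlier ones at its branching token, the completions are unique, and the first sample is plain greedy decoding. Regex-constrained generation, as mentioned in the abstract, could be layered on by masking out next tokens that would violate the pattern before ranking; that step is omitted from this sketch.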
- Dejan Grubisic
- Chris Cummins
- Volker Seeker
- Hugh Leather