Who is leading in AI? An analysis of industry AI research (2312.00043v1)
Abstract: AI research is increasingly industry-driven, making it crucial to understand company contributions to this field. We compare leading AI companies by research publications, citations, size of training runs, and contributions to algorithmic innovations. Our analysis reveals the substantial role played by Google, OpenAI and Meta. We find that these three companies have been responsible for some of the largest training runs, have developed a large fraction of the algorithmic innovations that underpin large language models (LLMs), and have led on various metrics of citation impact. In contrast, leading Chinese companies such as Tencent and Baidu had a lower impact on many of these metrics compared to their US counterparts. We observe that many industry labs are pursuing large training runs, and that training runs from relative newcomers -- such as OpenAI and Anthropic -- have matched or surpassed those of long-standing incumbents such as Google. The data reveals a diverse ecosystem of companies steering AI progress, though US labs such as Google, OpenAI and Meta lead across critical metrics.