
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order (2404.00399v2)

Published 30 Mar 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Pretrained LLMs underpin many AI applications, but the high computational cost of training them limits accessibility. Initiatives such as BLOOM and StarCoder aim to democratize access to pretrained models for collaborative community development. However, existing models face challenges: limited multilingual capabilities, catastrophic forgetting under continual pretraining, the high computational cost of pretraining from scratch, and compliance with AI safety and development laws. This paper presents Aurora-M, a 15B-parameter multilingual open-source model trained on English, Finnish, Hindi, Japanese, Vietnamese, and code. Continually pretrained from StarCoderPlus on 435 billion additional tokens, Aurora-M surpasses 2 trillion tokens in total training token count. It is the first open-source multilingual model fine-tuned on human-reviewed safety instructions, aligning its development not only with conventional red-teaming considerations but also with the specific concerns articulated in the Biden-Harris Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. Aurora-M is rigorously evaluated across various tasks and languages, demonstrating robustness against catastrophic forgetting and outperforming alternatives in multilingual settings, particularly in safety evaluations. To promote responsible open-source LLM development, Aurora-M and its variants are released at https://huggingface.co/collections/aurora-m/aurora-m-models-65fdfdff62471e09812f5407.

Introducing Aurora-M: A Multilingual Open-Source LLM Compliant with the Biden-Harris Executive Order on AI Safety

Overview of Aurora-M

The paper introduces Aurora-M, a 15B-parameter open-source multilingual LLM continually pretrained on a diverse and extensive dataset. Aurora-M stands out not only for its multilingual coverage, which spans English, Finnish, Hindi, Japanese, Vietnamese, and code, but also for its alignment with AI safety and legal standards, specifically the Biden-Harris Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. The model was continually pretrained from StarCoderPlus on an additional 435 billion tokens, bringing its total training token count past 2 trillion. This training regime makes Aurora-M robust against catastrophic forgetting and gives it strong performance in multilingual settings, particularly in safety evaluations.
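
For readers who want to try the released checkpoints, the minimal sketch below loads one of the Aurora-M models with the Hugging Face transformers library and generates from a Finnish prompt. The model identifier "aurora-m/aurora-m-base" is an assumption for illustration; the exact repository names are listed in the linked collection.

```python
# Minimal sketch: loading an Aurora-M checkpoint with Hugging Face transformers.
# The model id "aurora-m/aurora-m-base" is an assumption; check the linked
# collection for the exact repository names.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aurora-m/aurora-m-base"  # hypothetical identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 15B parameters; half precision keeps memory manageable
    device_map="auto",
)

prompt = "Kirjoita lyhyt tervehdys suomeksi:"  # Finnish prompt, one of the covered languages
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```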

Data Curation and Processing

The dataset preparation for Aurora-M followed a two-stage training curriculum that integrates general text from diverse sources, covering both natural and programming languages, alongside instruction-tuning datasets. The Continual Auxiliary Pretraining (CAP) stage drew on general web data and multilingual corpora such as RefinedWeb and the Pile, while the Continual Alignment Tuning (CAT) stage focused on strengthening capabilities in specific areas and aligning the model with safety objectives. Rigorous data filtering was applied to ensure the quality and relevance of the training data, including the removal of toxic content and the anonymization of sensitive information.
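
To make the filtering step concrete, the sketch below shows the kind of document-level filter described here: a length check, a toxicity screen, and anonymization of obvious personally identifiable strings. The toxicity_score helper, thresholds, and regular expressions are illustrative assumptions and do not reproduce the authors' actual pipeline.

```python
# Illustrative sketch of document-level filtering for continual-pretraining data.
# The toxicity_score() helper, thresholds, and regexes are assumptions for
# demonstration; they do not reproduce the authors' actual pipeline.
import re
from typing import Iterable, Iterator

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
IP_RE = re.compile(r"\b\d{1,3}(\.\d{1,3}){3}\b")

def toxicity_score(text: str) -> float:
    """Placeholder: in practice this would call a trained toxicity classifier."""
    blocklist = ("badword1", "badword2")  # hypothetical terms
    hits = sum(text.lower().count(w) for w in blocklist)
    return min(1.0, hits / 5.0)

def anonymize(text: str) -> str:
    """Replace obvious PII patterns with placeholder tokens."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    return IP_RE.sub("<IP_ADDRESS>", text)

def filter_documents(docs: Iterable[str],
                     max_toxicity: float = 0.1,
                     min_chars: int = 200) -> Iterator[str]:
    """Yield cleaned documents that pass length and toxicity checks."""
    for doc in docs:
        if len(doc) < min_chars:
            continue
        if toxicity_score(doc) > max_toxicity:
            continue
        yield anonymize(doc)

if __name__ == "__main__":
    sample = ["Too short.", "A sufficiently long example document, contact me at a@b.com. " * 10]
    print(list(filter_documents(sample)))
```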

Training Methodology

Aurora-M was trained on the LUMI supercomputer using mixed-precision training and a carefully tuned learning-rate schedule, over a training period of 48 days. The run was not only computationally efficient but also environmentally considerate: LUMI runs entirely on hydro-powered energy and recycles its waste heat.
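
As a rough illustration of two of the ingredients named above, mixed-precision training and a warmup-then-decay learning-rate schedule, the sketch below wires them together with standard PyTorch utilities. The model, optimizer settings, and schedule shape are generic placeholders, not the hyperparameters used for Aurora-M on LUMI.

```python
# Generic sketch of bf16 mixed-precision training with a linear-warmup /
# cosine-decay learning-rate schedule. All hyperparameters are placeholders,
# not the values used for Aurora-M.
import math
import torch
from torch import nn

def lr_lambda(step: int, warmup: int = 2000, total: int = 100_000) -> float:
    """Linear warmup followed by cosine decay to 10% of the peak rate."""
    if step < warmup:
        return step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return 0.1 + 0.9 * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))

model = nn.TransformerEncoderLayer(d_model=512, nhead=8)  # tiny stand-in for the 15B LLM
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(10):  # toy loop; the real run covered 435B additional tokens
    batch = torch.randn(8, 16, 512)  # (sequence, batch, features)
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        loss = model(batch).pow(2).mean()  # dummy objective in place of the LM loss
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```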

Emphasis on Safety and Legal Compliance

A critical aspect of Aurora-M's development was its instruction tuning on a carefully curated dataset designed to align with the focus areas of the Biden-Harris Executive Order. This safety tuning mitigates risks associated with AI applications and helps ensure that the model's outputs adhere to accepted ethical and legal standards. The construction of this tailored safety dataset reflects a proactive approach to contemporary concerns about AI safety and compliance.
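
The summary does not reproduce the safety dataset itself, but the sketch below illustrates how pairs of a potentially harmful instruction and a human-reviewed safe response might be rendered into text examples for supervised fine-tuning. The field names and prompt template are assumptions for illustration, not the actual Aurora-M dataset schema.

```python
# Minimal sketch of turning (instruction, safe_response) pairs into text
# examples for supervised fine-tuning. The field names and the prompt
# template are illustrative assumptions, not the Aurora-M dataset schema.
from dataclasses import dataclass
from typing import List

@dataclass
class SafetyExample:
    instruction: str    # a potentially harmful or sensitive request
    safe_response: str  # a human-reviewed refusal or safe completion

TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"

def to_training_text(examples: List[SafetyExample]) -> List[str]:
    """Render each pair with a simple instruction-following template."""
    return [TEMPLATE.format(instruction=e.instruction, response=e.safe_response)
            for e in examples]

if __name__ == "__main__":
    demo = [SafetyExample(
        instruction="Explain how to synthesize a dangerous pathogen.",
        safe_response="I can't help with that. Creating dangerous pathogens is illegal "
                      "and poses severe public-health risks.")]
    print(to_training_text(demo)[0])
```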

Evaluation and Performance

Aurora-M was subjected to comprehensive evaluations across a range of tasks and languages. Its performance was benchmarked against leading models, demonstrating strong capabilities in multilingual language understanding and generation as well as in coding-related tasks. Notably, Aurora-M performed especially well in safety evaluations, supporting its goal of producing legally compliant and ethically sound content.
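
For the coding-related evaluations, benchmarks in this space typically report pass@k, estimated with the standard unbiased formula pass@k = 1 - C(n-c, k)/C(n, k), where n completions are sampled per problem and c of them pass the unit tests. The snippet below computes this estimate; the sample numbers are made up for illustration and are not Aurora-M's reported results.

```python
# Unbiased pass@k estimator commonly used for code-generation benchmarks:
# pass@k = 1 - C(n - c, k) / C(n, k), where n completions are sampled per
# problem and c of them pass the unit tests.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k randomly chosen samples (out of n) passes."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example with made-up numbers: 200 samples per problem, 30 of which pass.
print(f"pass@1  = {pass_at_k(200, 30, 1):.3f}")   # 0.150
print(f"pass@10 = {pass_at_k(200, 30, 10):.3f}")
```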

Contributions and Future Directions

The development of Aurora-M represents a significant step forward in the field of AI research, particularly in fostering open-source LLM development. The model's release is intended to encourage further research and innovation, with its underlying datasets and training methodologies made accessible for community refinement and expansion. Looking ahead, there are plans to explore continual training of Aurora-M on advanced base models and expand its domain-specific expertise, leveraging the insights gained from this project to push the boundaries of AI capabilities while maintaining a steadfast commitment to safety and legal compliance.

In conclusion, Aurora-M combines technical strength, multilingual inclusivity, and a clear commitment to safe and ethical AI development. Its introduction paves the way for further advances in LLM research and applications, promising wider accessibility and responsible innovation in the AI domain.

Authors (45)
  1. Taishi Nakamura
  2. Mayank Mishra
  3. Simone Tedeschi
  4. Yekun Chai
  5. Jason T Stillerman
  6. Felix Friedrich
  7. Prateek Yadav
  8. Tanmay Laud
  9. Vu Minh Chien
  10. Terry Yue Zhuo
  11. Diganta Misra
  12. Ben Bogin
  13. Xuan-Son Vu
  14. Marzena Karpinska
  15. Arnav Varma Dantuluri
  16. Wojciech Kusa
  17. Tommaso Furlanello
  18. Rio Yokota
  19. Niklas Muennighoff
  20. Suhas Pai