Cohere Labs Scholars Program

Your Research Journey Starts Here

As a research scholar, you will contribute to state-of-the-art research on frontier AI, gaining access to a top-tier industry research lab at the beginning of your career.

You will become part of our dedicated team of passionate researchers and industry experts. You will be paired with a project proposal and top-tier researchers, an experience designed to support your growth as a researcher. The program is a full-time, paid position that gives scholars access to a large-scale experimental framework and world-class research mentors.


About the program

Since 2023, we have welcomed three cohorts to the Cohere Labs Scholars Program to help close the gap between research experience and opportunity. We have supported talented researchers from around the world as they work on open-ended questions with our in-house research team. We are incredibly proud of the many scientific contributions these scholars have made during the program.

Machine learning is advancing rapidly, but limited resources restrict fundamental NLP research and large-scale ML experiments. To foster new developments, it is crucial to broaden access and encourage participation in fundamental research. Our mission is to address these gaps and empower the next generation of rising stars as they venture into their research endeavors.

  • 21 Research Scholars to Date
  • 9 Countries Represented Among Scholars
  • 25 Published Papers
  • 25+ Project Mentors

Papers Published by Our Scholars

When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs

We study robust scaling for open-ended generative tasks in multilingual, multi-task settings. Our findings show that sampling and selection strategies must adapt to diverse domains and languages. We propose novel strategies, yielding notable gains across languages and tasks.

One Tokenizer To Rule Them All: Emergent Language Plasticity via Multilingual Tokenizers

Using a universal tokenizer trained for more languages than the primary pretraining languages significantly improves language plasticity, enabling up to 20.2% higher adaptation rates to new languages post-training, with minimal performance compromise on pretraining languages.

RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs

We introduce a novel, scalable method for generating high-quality multilingual feedback data to balance data coverage. We establish the benefits of cross-lingual transfer and increased dataset size in preference training.

LLM See, LLM Do: Guiding Data Generation to Target Non-Differentiable Objectives

Our work exhaustively characterizes the impact of passive inheritance of model properties by systematically studying the consequences of synthetic data integration.

The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm

We explore the viability of different alignment approaches when balancing dual objectives: addressing and optimizing for a non-homogeneous set of languages and cultural preferences while minimizing both global and local harms.

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Including code in the pre-training data mixture, even for models not specifically designed for code, has become common practice in LLM pre-training. We ask: what is the impact of code data used in pre-training on a large variety of downstream tasks beyond code generation?

Multilingual Arbitrage: Optimizing Data Pools to Accelerate Multilingual Progress

Can you surpass individual model performance by sampling parts of the distribution strategically from a pool of models? We introduce “multilingual arbitrage” to describe capitalizing on performance variations to produce large gains in performance.

Learn More

Scholars Program: Research Journeys Start Here

Blog post sharing details about our most recent recruitment cycle.

Reflecting on the journey

2024 Scholar, Luísa Shimabucoro, reflects on her experience

Article in Fortune announcing the Scholars Program

Expanding access might be the key to resolving the A.I. talent shortage

Frequently Asked Questions

  • When can I apply for the Scholars Program?
    • Applications open each year in August; the Scholars Program begins annually in January.

  • Do I need previous research experience to apply to the Scholars Program?
    • No. In fact, we are particularly interested in identifying candidates with strong engineering skills and demonstrated creative thinking but limited or no experience with published papers. You do not need previous research experience to apply.

  • Do I need to submit a research project proposal to work on during the Scholars Program?
    • We want you to focus on showcasing your skills in the application process, so there is no need to propose a research project. We will match our Scholars to research projects and mentors.

  • How does project matching work?
    • We will consider the skills, interests, and background of our successful applicants alongside our research projects to make the best possible match between Scholar and project.

  • Is the Scholars Program paid?
    • Yes, this is a full-time, paid position.

  • Is the Scholars Program remote, or will I need to relocate to a Cohere office?
    • The Scholars Program will operate on a remote-first basis. We want to make this opportunity available to emerging ML researchers around the world. If you are based in or visiting a city with a Cohere office, you are welcome to come work with us in person.

  • Will Cohere Labs sponsor work permits?
    • The Scholars Program is structured as a remote-first opportunity, with no requirement to relocate.

  • Is it possible to do the Scholars Program on a part-time basis?
    • The Scholars Program is structured as a rigorous full-time commitment. To keep up with the pace of ML research, we look forward to supporting Scholars as they dive into the research process and work towards publication within the program period.

  • I was unsuccessful in my application–can I reapply for future Scholars Programs?
    • Yes–we welcome re-applications for future cohorts. We see this as a positive signal of your continued interest and commitment to the program.

Cohere Labs

Cohere Labs is Cohere's research lab that seeks to solve complex machine learning problems. We support fundamental research that explores the unknown, and are focused on creating more points of entry into machine learning research.