What’s the maximum number of tokens you can handle?

Hey,

What’s the maximum number of tokens your Large model can process? Is it 2048?

Also, when was the cutoff for the last dataset that the model was trained on? Just so I know how current the information it uses is.

Thx

Yes, exactly: the maximum is 2048 tokens for the Large and Medium models, and 1024 tokens for the Small and embedding models (at the moment).
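
For anyone who wants to guard against these limits in code, here is a minimal sketch. It assumes the limit applies to the prompt plus any tokens you ask the model to generate, which this thread does not confirm, and it assumes you already have a token count from the model's own tokenizer. The names `MAX_TOKENS` and `fits` are made up for illustration and are not part of any SDK.

```python
# Limits as quoted in this thread ("at the moment"); they may change.
MAX_TOKENS = {"large": 2048, "medium": 2048, "small": 1024, "embedding": 1024}


def fits(model: str, prompt_tokens: int, max_new_tokens: int = 0) -> bool:
    """Return True if the prompt plus the requested completion fits the limit.

    Assumption: the limit covers prompt and generated tokens together.
    `prompt_tokens` must come from the model's tokenizer, not a word count.
    """
    return prompt_tokens + max_new_tokens <= MAX_TOKENS[model]
```

For example, `fits("large", 2000, 48)` is true, while `fits("small", 1000, 100)` is false.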

Thank you Jay.
I just noticed that your documentation only mentions 1024 here: API Documentation | Cohere AI

Worth updating.

Can you please also comment on my other question about how recent the dataset that the Large model was trained on is?

Many thanks,
Rami

Thanks for the heads up, Rami! We’ll get that fixed!

The cutoff date for the training dataset is around March 2021.

Hi @jay

I’m struggling to get any current affairs from the year 2021. I don’t believe the cutoff date for the training dataset is March 2021.

For example, when a prompt is created to list all the US presidents, it can list all of them up until Donald Trump. But it will struggle to include Joe Biden. This tells me the dataset it was trained on is prior to January 2021. Can you please elaborate?

If you could ask internally and find out the actual cutoff date for me, that’d be helpful. It’s quite important, please.

thanks,
Rami

Hey Rami. March 2021 is indeed the actual date, but perhaps three months of crawl data in 2021 is a small sample for the prompt you’re trying.

For Jan 2021, this prompt mostly generates “Joe Biden” with the Large model:

The inauguration of the 46th president of the United States took place on January 20, 2021, marking the start of the four-year term of

For Feb 2021, this prompt mostly generates “Jezero Crater” with the Large model:

February 18 – NASA’s Mars 2020 mission (containing the Perseverance rover and Ingenuity helicopter drone) lands on Mars at
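
Since generations vary from run to run ("mostly generates"), one way to check probes like the two above is to sample each prompt several times and count how often the expected answer appears. The sketch below is only an illustration: `generate` is a hypothetical stand-in for whatever function calls the model and returns the completion text, and `hit_rate`, `run_probes`, and `PROBES` are names invented here.

```python
from typing import Callable, Dict, Tuple

# Month -> (prompt, expected completion), taken from the two probes above.
PROBES: Dict[str, Tuple[str, str]] = {
    "2021-01": (
        "The inauguration of the 46th president of the United States took place "
        "on January 20, 2021, marking the start of the four-year term of",
        "Joe Biden",
    ),
    "2021-02": (
        "February 18 – NASA’s Mars 2020 mission (containing the Perseverance "
        "rover and Ingenuity helicopter drone) lands on Mars at",
        "Jezero Crater",
    ),
}


def hit_rate(generate: Callable[[str], str], prompt: str, expected: str,
             samples: int = 20) -> float:
    """Fraction of sampled completions that contain the expected text."""
    hits = sum(expected.lower() in generate(prompt).lower() for _ in range(samples))
    return hits / samples


def run_probes(generate: Callable[[str], str], samples: int = 20) -> Dict[str, float]:
    """Run every probe and return the hit rate per month."""
    return {
        month: hit_rate(generate, prompt, expected, samples)
        for month, (prompt, expected) in PROBES.items()
    }
```

A high hit rate for a month suggests the training data reaches that far; a low one is weaker evidence, since a few months of crawl data may simply be too small a sample for a given prompt.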