Cloudflare Docs
Workers AI

Changelog

2024-06-18

Native support for AI Gateway

Workers AI now natively supports AI Gateway.
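As a rough illustration of what this can look like from a Worker, the sketch below passes a gateway option as the third argument to the AI binding's run method. The binding interface is declared locally to keep the example self-contained, runViaGateway is a hypothetical helper, and the exact option shape is an assumption to verify against the AI Gateway documentation.

```typescript
// Minimal local typing of the AI binding; the real type comes from the Workers runtime.
interface GatewayOptions {
  id: string; // name of an AI Gateway in your account (a placeholder in the usage note)
}

interface AiBinding {
  run(
    model: string,
    inputs: Record<string, unknown>,
    options?: { gateway?: GatewayOptions },
  ): Promise<unknown>;
}

// Hypothetical helper: sends one prompt through the given gateway.
// Assumption: the gateway is selected with a third `options` argument.
function runViaGateway(ai: AiBinding, gatewayId: string, prompt: string): Promise<unknown> {
  return ai.run(
    "@cf/meta/llama-3-8b-instruct-awq",
    { prompt },
    { gateway: { id: gatewayId } },
  );
}
```

Inside a Worker this would be called as await runViaGateway(env.AI, "my-gateway", "Hello"), where "my-gateway" stands in for a gateway you have created.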

2024-06-11

Deprecation announcement for @cf/meta/llama-2-7b-chat-int8

We will be deprecating @cf/meta/llama-2-7b-chat-int8 on 2024-06-30.

Replace the model ID in your code with a new model of your choice.

If you do not switch to a different model by 2024-06-30, requests to @cf/meta/llama-2-7b-chat-int8 will automatically return inference from @cf/meta/llama-3-8b-instruct-awq.
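One way to make the switch in a single place is to route model IDs through a small helper. The sketch below is only an illustration: resolveModel is a hypothetical function, and the replacement shown is the fallback model named above, which you may swap for any other supported model.

```typescript
// Model named in the deprecation announcement.
const DEPRECATED_MODEL = "@cf/meta/llama-2-7b-chat-int8";
// Replacement used in this sketch; substitute the model of your choice.
const REPLACEMENT_MODEL = "@cf/meta/llama-3-8b-instruct-awq";

// Hypothetical helper: returns the model ID to call, swapping out the deprecated one.
function resolveModel(modelId: string): string {
  return modelId === DEPRECATED_MODEL ? REPLACEMENT_MODEL : modelId;
}
```

Call sites can then pass resolveModel(configuredModel) to the binding instead of a hard-coded ID.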

2024-05-29

Add new public LoRAs and note on LoRA routing

  • Added documentation on new public LoRAs.
  • Noted that you can now run LoRA inference with the base model rather than explicitly calling the -lora version.
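The routing note can be pictured with the following sketch, which sends a request to a base model ID and names the adapter in the inputs. The lora input field, the local binding interface, and the helper names are assumptions made for illustration; the model ID and adapter name are left as parameters because this changelog does not list them.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface LoraInputs {
  messages: ChatMessage[];
  lora: string; // assumption: the adapter is selected by name or ID in the inputs
}

interface AiBinding {
  run(model: string, inputs: LoraInputs): Promise<unknown>;
}

// Hypothetical helper: builds the inputs for a single-turn request with an adapter.
function buildLoraInputs(loraName: string, userText: string): LoraInputs {
  return {
    messages: [{ role: "user", content: userText }],
    lora: loraName,
  };
}

// Calls the base model directly instead of a "-lora" suffixed model ID.
function runWithLora(
  ai: AiBinding,
  baseModel: string,
  loraName: string,
  userText: string,
): Promise<unknown> {
  return ai.run(baseModel, buildLoraInputs(loraName, userText));
}
```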

2024-05-17

Add OpenAI-compatible API endpoints

Added OpenAI-compatible API endpoints for /v1/chat/completions and /v1/embeddings. For more details, refer to Configurations.
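To give a sense of how the chat completions endpoint might be called over HTTP, the sketch below builds an OpenAI-style request. The base URL under api.cloudflare.com, the bearer-token header, and the helper name buildChatCompletionRequest are assumptions for illustration; confirm the exact URL and authentication on the Configurations page.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds a request for the /v1/chat/completions endpoint.
// Assumption: the endpoint lives under the account-scoped REST API base URL.
function buildChatCompletionRequest(
  accountId: string,
  apiToken: string,
  model: string,
  messages: ChatMessage[],
): HttpRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/v1/chat/completions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    // OpenAI-style body: a model ID plus a list of chat messages.
    body: JSON.stringify({ model, messages }),
  };
}
```

The result can be passed to fetch as fetch(req.url, { method: req.method, headers: req.headers, body: req.body }). Because the endpoints follow the OpenAI request shape, an existing OpenAI client pointed at the same base URL should also work.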

2024-04-11

Add AI native binding

  • Added the new native AI binding. You can now run models with const resp = await env.AI.run(modelName, inputs).
  • Deprecated the @cloudflare/ai npm package. Existing solutions using the @cloudflare/ai package will continue to work, but they will not receive new Workers AI features. Moving to the native AI binding is highly recommended.
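A minimal sketch of the native binding call is shown below. It assumes the binding is exposed on env as AI, declares the binding type locally so the example stands alone, and uses a hypothetical helper name (answer); the model ID is just one that appears elsewhere in this changelog.

```typescript
// Local stand-in for the binding type provided by the Workers runtime.
interface AiBinding {
  run(modelName: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding; // assumption: the binding is configured under the name "AI"
}

// Hypothetical helper: runs a text model through the native binding,
// with no @cloudflare/ai import required.
async function answer(env: Env, prompt: string): Promise<unknown> {
  const resp = await env.AI.run("@cf/meta/llama-3-8b-instruct-awq", { prompt });
  return resp;
}
```

In a Worker, answer(env, "Hello") would typically be awaited inside the fetch handler and its result serialized into the response.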