Function Calling at the Edge – The Berkeley Artificial Intelligence Research Blog

The ability of LLMs to execute commands through plain language (e.g. English) has enabled agentic systems that can complete a user query by orchestrating the right set of tools (e.g. ToolFormer, Gorilla). This, along with recent multi-modal efforts such as the GPT-4o and Gemini-1.5 models, has expanded the realm of possibilities with AI agents. […]
Automate Supply Chain Analytics Workflows with AI Agents using n8n

Why build things the hard way when you can design them the smart way? As a Supply Chain Data Scientist, I’ve explored various frameworks like LangChain and LangGraph to build AI agents using Python. [Figure: Leveraging LLMs with LangChain for Supply Chain Analytics: A Control Tower Powered by GPT (image by Samir Saci)] The illustration above is from an […]
A Case Study with the StrongREJECT Benchmark – The Berkeley Artificial Intelligence Research Blog

When we began studying jailbreak evaluations, we found a fascinating paper claiming that you could jailbreak frontier LLMs simply by translating forbidden prompts into obscure languages. Excited by this result, we attempted to reproduce it and found something unexpected.
The Ultimate AI/ML Roadmap For Beginners

AI is transforming the way businesses operate, and nearly every company is exploring how to leverage this technology. As a result, the demand for AI and machine learning skills has skyrocketed in recent years. With nearly four years of experience in AI/ML, I’ve decided to create the ultimate guide to help you enter this rapidly […]
Language Models Reinforce Dialect Discrimination – The Berkeley Artificial Intelligence Research Blog

[Figure: Sample language model responses to different varieties of English and native speaker reactions.] ChatGPT does amazingly well at communicating with people in English. But whose English? Only 15% of ChatGPT users are from the US, where Standard American English is the default. But the model is also commonly used in countries and communities where people […]
What Do Machine Learning Engineers Do?

In this article, I want to explain what I do as a machine learning engineer. The aim is to help anyone looking to enter the field gain a truthful view of what a machine learning engineer is, how we work, what we do, and what a typical day in the life is like. I hope it […]
A 100-AV Highway Deployment – The Berkeley Artificial Intelligence Research Blog

We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone. Our goal is to tackle “stop-and-go” waves, those frustrating slowdowns and speedups that usually have no clear cause but lead to congestion and significant energy waste. To train efficient […]
Build Your Own AI Coding Assistant in JupyterLab with Ollama and Hugging Face

Jupyter AI brings generative AI capabilities right into the interface. Having a local AI assistant ensures privacy, reduces latency, and provides offline functionality, making it a powerful tool for developers. In this article, we’ll learn how to set up a local AI coding assistant in JupyterLab using Jupyter AI, Ollama and Hugging Face. By the […]
Six Organizational Models for Data Science

Data science teams can operate in myriad ways within a company. These organizational models influence the type of work that the team does, but also the team’s culture, goals, impact, and overall value to the company. Adopting the wrong organizational model can limit impact, cause delays, and compromise the morale of a team. As […]
Google’s Data Science Agent: Can It Really Do Your Job?

On March 3rd, Google officially rolled out its Data Science Agent to most Colab users for free. This is not something brand new — it was first announced in December last year, but it is now integrated into Colab and made widely accessible. Google says it is “The future of data analysis with Gemini”, stating: […]