Agent Bricks: How Databricks Is Making Data Intelligence Production-Ready for Everyone

With the recent explosion of AI models and the rapid, ongoing innovation in that space, enterprises are trying to use the technology to gain a competitive edge. But while base AI models offer general capabilities, they fall short when it comes to deeply understanding proprietary data, workflows, and domain-specific nuances. Enterprises need intelligent agents that capitalize on their most valuable resource – their data. They don’t just need Artificial Intelligence; they need Data Intelligence, whether the goal is building innovative products, surfacing quick insights for better decision making, or improving productivity.

However, the task of augmenting base models with proprietary data is not easy and the path to production remains treacherous. A staggering 90% of enterprise Gen AI projects fail to reach production. Why? The reasons may be familiar:

  • Too many variables: Tuning models and pipelines requires deep expertise. There are many choices to make, such as the base model, embedding model, chunk strategy, and architecture decisions.
  • Vibe-check quality: Evaluation is often manual, subjective, and inconsistent. In fact, 85% of users still manually inspect agent outputs and they often do so without a clear measure of what “good” is.
  • Slow or expensive deployment: Scaling agents reliably across environments is a major hurdle. The balance of cost and quality can be hard to get right.

To address these challenges, Databricks has introduced Agent Bricks.

Agent Bricks: Data Intelligence Made Easy 

Agent Bricks is a no-code platform for building and deploying AI agents tailored to enterprise use cases. Powered by Mosaic AI, it abstracts away the complexity of model customisation, evaluation, and deployment, so teams can focus on outcomes, not infrastructure.

Here’s what makes Agent Bricks stand out:

  • Bring Your Own Data: Base models aren’t enough. Agent Bricks lets you ground agents in your proprietary data for domain-specific intelligence.
  • No-Code Interface: Build agents without writing a single line of code, perfect for cross-functional teams.
  • Self-Improving Agents: Agents continuously improve over time using natural language feedback.
  • Production-Ready Deployment: Deploy to Mosaic endpoints or Lakeflow Spark Declarative Pipelines in a matter of clicks. 

Real-World Use Cases 

Agent Bricks makes it quick and easy to create AI agents for some of the most common tasks. The current use cases are:

  • Information Extraction: Convert documents into structured outputs for downstream analytics. This involves two steps: first, the document is parsed using the ai_parse_document function, which extracts structured content from the unstructured file; second, a model presents the extracted data in the specified or suggested format. Supported file formats are .pdf, .jpg, .jpeg, .png, .doc, .docx, .ppt, and .pptx.
  • Custom LLMs: Tailor models to specific tasks like summarization, classification, or insight generation.
  • Knowledge Assistants: Build chatbots that truly understand your internal data and documentation.
  • Multi-Agent Supervision: Coordinate multiple agents to handle complex workflows with inter-agent communication. 
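Before pointing the Information Extraction task at a directory, it can be worth screening the inputs against the supported file formats listed above. A minimal sketch in plain Python (the helper name is invented for illustration):

```python
from pathlib import Path

# File formats supported by the Information Extraction parsing step,
# as listed above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".jpg", ".jpeg", ".png", ".doc", ".docx", ".ppt", ".pptx",
}

def supported_documents(paths):
    """Keep only the files the parsing step can handle (case-insensitive)."""
    return [p for p in paths if Path(p).suffix.lower() in SUPPORTED_EXTENSIONS]

print(supported_documents(["menu.pdf", "photo.JPG", "notes.txt"]))
# → ['menu.pdf', 'photo.JPG']
```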

Model Evaluation 

One of the biggest blockers to deploying to production is evaluation. It is often difficult to define exactly what “good” or “good enough” means in the context of AI agent performance. To help with this, Agent Bricks provides the following:

  • Automatic Benchmarking: It generates evaluation benchmarks automatically, removing guesswork. It does this by generating synthetic evaluation data and using an LLM as a judge.
  • Automatic Optimization: It loops through and combines various optimisation techniques for the model.
  • Natural Language Feedback Loops: Improve agents using plain language – no need for labelled datasets.
  • MLflow Integration: Under the hood, Agent Bricks leverages MLflow, the most widely adopted open-source MLOps platform.
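The benchmarking idea above can be sketched in plain Python. Everything here is illustrative, not Agent Bricks' actual implementation: the judge is a trivial keyword-matching stub standing in for a real LLM judge, and the evaluation set is invented:

```python
def keyword_judge(question, answer, required_keywords):
    """Stand-in for an LLM judge: return the fraction of required
    keywords that the answer mentions (1.0 means full coverage)."""
    hits = sum(1 for kw in required_keywords if kw.lower() in answer.lower())
    return hits / len(required_keywords)

def benchmark(agent, eval_set, judge):
    """Run the agent over a synthetic eval set and average the judge scores."""
    scores = [judge(q, agent(q), kws) for q, kws in eval_set]
    return sum(scores) / len(scores)

# Tiny synthetic eval set: (question, keywords a good answer should contain).
eval_set = [
    ("What is Agent Bricks?", ["agent", "no-code"]),
    ("What powers Agent Bricks?", ["mosaic"]),
]

def toy_agent(question):
    return "Agent Bricks is a no-code platform for building AI agents, powered by Mosaic AI."

print(benchmark(toy_agent, eval_set, keyword_judge))  # → 1.0
```

The point is the shape of the loop: generate an eval set, score each response with a judge, and aggregate into a single quality metric that can be tracked across iterations.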

Seamless Deployment 

Once your agent is ready, it is deployed to a Mosaic AI endpoint, which suits both batch and real-time workloads. It is also easy to integrate the model into a Lakeflow Spark Declarative Pipeline; in fact, it is possible to create a pipeline with calls to the deployed model in a couple of clicks from the Agents UI.
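For real-time use, the deployed endpoint can be called over REST. The sketch below only builds the request (it does not send it); the workspace URL, endpoint name, and token are placeholders, and the URL pattern and payload shape follow the Databricks model serving REST API – adjust them to your endpoint's expected input format:

```python
import json
import urllib.request

def build_invocation_request(workspace_url, endpoint_name, token, records):
    """Build (but do not send) a REST request to a model serving endpoint."""
    url = f"{workspace_url}/serving-endpoints/{endpoint_name}/invocations"
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_invocation_request(
    "https://example.cloud.databricks.com",  # hypothetical workspace URL
    "menu-extraction-agent",                 # hypothetical endpoint name
    "<personal-access-token>",               # placeholder credential
    [{"text": "Margherita Pizza - tomato, mozzarella, basil"}],
)
print(req.full_url)
```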

Information Extraction Walkthrough 

I tried out the Information Extraction task for myself to see just how easy it is to transform documents into structured data. For this test, I used a PDF file of the menu of a local restaurant. The following is a step-by-step replay of what I did to deploy an agent all the way to production.

Firstly, I created a catalog, a schema, and a volume, then uploaded the PDF file into the volume.

I then navigated to Agents in the sidebar and selected the Information Extraction task, with the first subtask being the Use PDFs step. In the UI, I specified the volume directory where the PDF was stored and selected the schema and table name under which to save the extracted text. Extracting the data took a matter of seconds with a serverless SQL warehouse.

I then selected Build on the task, named my model, selected an unlabelled dataset, and specified the table and column containing the extracted text. This defined my source data.

I wanted the output of the model to be presented in a particular format, so I provided an example JSON. Agent Bricks also intelligently suggests an output format, which in this case was very similar to what I wanted, and with just some small modifications I was able to finalise the specification. The agent was then created in a matter of minutes.
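For a sense of what such an example JSON might look like, here is a hypothetical output format for a menu-extraction agent (the field names and values are invented, not the exact schema I used):

```python
import json

# Hypothetical target output format for a restaurant-menu extraction agent.
example_output = {
    "restaurant_name": "Example Bistro",
    "dishes": [
        {
            "name": "Margherita Pizza",
            "price": 9.50,
            # Ingredients as a single comma-separated string.
            "ingredients": "tomato, mozzarella, basil",
        }
    ],
}

print(json.dumps(example_output, indent=2))
```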

After creation, the model can be tuned by fixing responses: the user can provide additional instructions, add more detail about the fields to extract, and give feedback on the accuracy of the model's responses, then redeploy and iteratively improve the model. In my case, I gave feedback on the structure of the fields to be extracted, specifically that I wanted the ingredients to be given as a comma-separated string of the different items.

Once I was satisfied that the model was in a ready state, I was able to deploy it to a Lakeflow Spark Declarative Pipeline in two clicks. With that, my model was deployed and could be used in production for streaming or batch workloads. I ran the pipeline and inspected the results to find that the agent had performed as expected. 

This was the extent of my simple test with Agent Bricks but it highlighted to me how easy it would be to incorporate this model into an Autoloader style pipeline that could automatically and incrementally ingest unstructured data uploaded to cloud storage, extract the text, and then store the key information in a table for downstream analytics or use cases.
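As a rough illustration of that incremental pattern – in plain Python, standing in for what Auto Loader does natively against cloud storage, with invented function names:

```python
def incremental_ingest(all_files, processed, extract):
    """Process only files not seen before, mimicking the checkpoint-based,
    incremental behaviour Auto Loader provides over cloud storage."""
    new_files = [f for f in all_files if f not in processed]
    results = {f: extract(f) for f in new_files}
    processed.update(new_files)  # record what has now been ingested
    return results

processed = {"menu_v1.pdf"}               # previously ingested files
uploads = ["menu_v1.pdf", "menu_v2.pdf"]  # current contents of storage
results = incremental_ingest(uploads, processed, lambda f: f"extracted:{f}")
print(results)  # → {'menu_v2.pdf': 'extracted:menu_v2.pdf'}
```

In a real pipeline, the extract step would be the call to the deployed agent, and the checkpointing would be handled for you by the pipeline engine rather than by a Python set.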

From the creation of the catalog to the deployment of the agent in a Spark Declarative Pipeline, the whole process took approximately ten to fifteen minutes and involved no lines of code. Having never used Agent Bricks before, I was very impressed by its capabilities. I believe this could be transformational for enterprises of all types, including those with limited technical experience in AI or even programming. Databricks as a platform has been democratising data for a long time, but this is its biggest step towards democratising AI so far.

Try it yourself

Agent Bricks is now available in beta in selected regions (currently the US). Ask your workspace administrator to enable the feature in the Previews menu of the workspace settings. This should make Agents available in the AI/ML section of the sidebar.

Jordan Begg
Senior Data Engineer