Conversational AI with Microsoft Fabric Data Agents

Fabric Data Agents are really cool. You can talk to your data. You don’t need to have a pre-built dashboard anymore—you can just chat with your data model.

And the best part? I can show you how to set this up and actually use it. Let’s dive in.

Setting Up Your Semantic Model

For this demo, I’m using a dataset I’ve created with New York City taxi data. I’ve used it before in other videos as well, and I’ve created a semantic model with it.

The model has a taxi trips fact table and some dimensions: the taxi type (yellow or green), the pickup and drop-off locations, dates, and times.

Looking at the semantic model itself, we have a properly created star schema. That should be enough for Fabric Data Agents to work with.

Creating DAX Measures

The only thing is that in the taxi trips fact table, we don’t have any DAX measures—any explicit measures. So let’s create a few, especially on the trip distance and passenger count, so we can easily analyze those.

First, a new measure for number of passengers:

Number of Passengers = SUM('Taxi Trips'[Passenger Count])

And another measure for trip distance:

Total Trip Distance = SUM('Taxi Trips'[Trip Distance])
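
The questions later in this post count pickups rather than passengers. One way to make that explicit would be a row-count measure on the fact table. This is only a sketch: `Number of Trips` is a hypothetical name and not a measure from the demo model.

```dax
// Hypothetical measure, not part of the demo model:
// counts the rows (one row per trip) in the fact table.
Number of Trips = COUNTROWS('Taxi Trips')
```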

I’ve also created a very simple Power BI validation report that we can use to find numbers and validate our dataset against what the agent tells us.

Creating Your Data Agent

Let’s start creating our data agent. Click on “New item”, go to “All items”, and search for “Data Agent”. It’s still in preview, but we can use it.

I’ll name it “TaxiAnalysis” (you cannot put spaces in there, apparently). Click on create, and the agent will be loaded.

Fabric is telling us to add data sources first. So let’s click on “Add data sources”. We can add semantic models, lakehouses, and so on. We’ll add a semantic model for today.

The data agent checks that the data source is set up properly. Of course it is, so there’s no issue there.

Now we can start selecting tables from that semantic model that we want to expose to the agent. I’m happy to expose all of them.

Asking Your First Question

We can start asking questions now to the agent. Let’s try this:

“In 2023, what was the total number of passengers that took green taxis?”

Fabric is analyzing the data. As you can see, it now understands it needs to go into the semantic model. I can open the DAX query that’s being created by the agent.

The filter for calendar year 2023 looks fine. Number of passengers—that’s also fine. It’s picking the proper measure we just created. But for taxi types, I’m not exactly sure whether “green taxis” is actually what we need in the filter.

Let’s check the validation report. Looking at the taxi types dimension, the value is “green”, not “green taxis”. So the agent isn’t mapping my wording to the value that’s actually stored in the dimension.
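
If you prefer a query over the report, the same check can be done with a one-line DAX query that lists the distinct values in the dimension. This is a minimal sketch: the table name `'Taxi Types'` and the column name `[Taxi Type]` are assumptions about how the model is named, so adjust them to your own model.

```dax
// Assumed names: 'Taxi Types' table with a [Taxi Type] column.
// Returns one row per distinct taxi type stored in the dimension.
EVALUATE
VALUES('Taxi Types'[Taxi Type])
```

Whatever values come back here are the ones the agent’s filter has to use.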

Adding AI Instructions

We need to give the AI some instructions. This is where you can help guide the agent to understand your data better.

I’ll add this instruction:

“When I talk about taxi types, I want you to find the available types in the taxi types table, please.”

Now, let’s ask again: “Can you show me the number of passengers that traveled with green taxis in 2023?”

As you can see, this is still a work in progress. Right now there’s no AI we can rely on to know everything about your business and your data out of the box. We need to help it a little bit.

But because we put in the AI instructions, we got the answer: in 2023, a total of 946,072 passengers traveled with green taxis.

We can open the completed steps and see the DAX query that has been sent to the Power BI semantic model. In this case, we see that the taxi type filter includes “green”. That’s actually cool.
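
For reference, a query of this shape could look roughly like the sketch below. It is not the exact query the agent generated. `[Number of Passengers]` is the measure created earlier, while `'Date'[Calendar Year]` and `'Taxi Types'[Taxi Type]` are assumed names for the date and taxi type columns.

```dax
// Sketch only, not the agent's actual output.
// Assumed names: 'Date'[Calendar Year], 'Taxi Types'[Taxi Type].
EVALUATE
ROW(
    "Number of Passengers",
    CALCULATE(
        [Number of Passengers],
        'Date'[Calendar Year] = 2023,
        'Taxi Types'[Taxi Type] = "green"   // the stored value, not "green taxis"
    )
)
```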

Validating the Results

Let’s validate that answer. I’ll go to my validation report, put the calendar year filter to 2023, add the taxi type filter for “green”, and look at the number of passengers measure.

This might take a while because there are literally billions of rows in this table. Actually, that was quite fast.

We find that in the green taxi category, there were 946,072 passengers in 2023. Our AI model understood the question and showed me the proper results. That’s really amazing.

More Complex Questions

Let’s try something more open-ended:

“Can you tell me in 2024 what were the most popular pickup locations in which boroughs and zones?”

This is a more open question. I’m not trying to pinpoint an exact number—I’m just trying to find what the most popular pickup location was.

The AI agent is actually understanding my question. It’s parsing the natural language as AI models do. I’m sure you’re familiar with ChatGPT, Copilot, Claude, and so on.

Queries are being created by the model. That’s basically what a Fabric Data Agent does: it takes your natural language query (the prompt that you give it) and translates it into a DAX query that can be executed to give you an answer.

We got an answer. For Queens, these were the most popular. For Manhattan, these were the most popular. That’s quite a lot.
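
To give an idea of what a “most popular pickup locations” query can look like, here is a hedged sketch. It is not the agent’s generated code. The names `'Pickup Location'[Borough]`, `'Pickup Location'[Zone]`, and `'Date'[Calendar Year]` are assumptions, and counting rows in `'Taxi Trips'` is my stand-in for “number of pickups”.

```dax
// Sketch only. Assumed names: 'Pickup Location'[Borough], 'Pickup Location'[Zone],
// 'Date'[Calendar Year]. One row in 'Taxi Trips' is treated as one pickup.
EVALUATE
TOPN(
    10,
    SUMMARIZECOLUMNS(
        'Pickup Location'[Borough],
        'Pickup Location'[Zone],
        TREATAS({2024}, 'Date'[Calendar Year]),   // restrict to 2024
        "Pickups", COUNTROWS('Taxi Trips')
    ),
    [Pickups], DESC
)
ORDER BY [Pickups] DESC
```

The cutoff of 10 is arbitrary. The question itself doesn’t say how many locations count as “most popular”, which is part of why the agent’s answer was so long.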

Follow-Up Questions

Let’s narrow this down. I always say thank you to my AI agents, by the way.

“Now give me the number one zone per borough and include the number of pickups in 2024.”

You can use this to dive into your data. In the beginning I wasn’t sure what exactly I was looking for. But now that I’m looking at the first answer, asking for the number one zone per borough along with the number of trips is a natural follow-up question.

We got an answer. For the Bronx, it was Mott Haven/Port Morris with 1.4 million pickups. For Manhattan, it was Midtown Center with 4.8 million pickups. This is actually amazing.
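
A “top one per group” question like this maps to a fairly common DAX pattern: iterate the boroughs and take the top zone inside each one. The sketch below shows one way to write it, under the same assumptions as before (`'Pickup Location'[Borough]`, `'Pickup Location'[Zone]`, and `'Date'[Calendar Year]` are assumed names, and one row in `'Taxi Trips'` is one pickup). The agent may well generate something different.

```dax
// Sketch only, with assumed table and column names.
EVALUATE
GENERATE(
    VALUES('Pickup Location'[Borough]),
    // Evaluated once per borough: CALCULATETABLE turns the current
    // borough row into a filter and adds the 2024 filter on top.
    TOPN(
        1,
        CALCULATETABLE(
            ADDCOLUMNS(
                VALUES('Pickup Location'[Zone]),
                "Pickups", CALCULATE(COUNTROWS('Taxi Trips'))
            ),
            'Date'[Calendar Year] = 2024
        ),
        [Pickups], DESC
    )
)
ORDER BY [Pickups] DESC
```

Keep in mind that `TOPN` returns all tied rows, so a borough could show more than one zone if two zones have exactly the same number of pickups.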

Year-Over-Year Comparisons

Let’s try one more before wrapping up:

“Are there any changes in the top one from 2023 to 2024?”

Let’s see if it can figure out that it now needs to do a year-over-year comparison and then find if there are changes or not. I only want to see the changes here. I hope the model can assume that’s my actual question.

Like ChatGPT these days, it usually assumes things correctly. It tells me there are no changes in the top one pickup zones per borough from 2023 to 2024. For both years, the leading zones remained the same. The number of pickups changed slightly, but the top pickup zones stayed the same in each borough.

I can validate the code it generated to give me this answer, and I always do that. We can look at the completed steps and find the code. It shows the pickups by borough and zone for 2023. That looks good.

Interestingly, I don’t find 2024 in the code. Maybe it remembers the previous answer for 2024. That would actually be a smart thing for the model to do—it doesn’t need to compute answers that it already knows.

I think it understood that these are the 2024 numbers from the previous query, then created the 2023 numbers, and showed that there were no differences.
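
If you don’t want to rely on what the agent remembers, you can check the comparison yourself by putting both years side by side in one query. This is a minimal sketch with assumed names (`'Pickup Location'[Borough]`, `'Pickup Location'[Zone]`, `'Date'[Calendar Year]`), again counting rows in `'Taxi Trips'` as pickups.

```dax
// Sketch only, with assumed table and column names.
// One row per borough and zone, with the pickups for each year as columns.
EVALUATE
SUMMARIZECOLUMNS(
    'Pickup Location'[Borough],
    'Pickup Location'[Zone],
    "Pickups 2023", CALCULATE(COUNTROWS('Taxi Trips'), 'Date'[Calendar Year] = 2023),
    "Pickups 2024", CALCULATE(COUNTROWS('Taxi Trips'), 'Date'[Calendar Year] = 2024)
)
ORDER BY 'Pickup Location'[Borough], [Pickups 2024] DESC
```

With the rows sorted by borough and by 2024 pickups, the first row of each borough is its 2024 leader, and you can see at a glance whether that zone also has the highest 2023 number in that borough.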

Publishing Your Data Agent

I can add example queries if I want to share this data agent with my colleagues. I won’t do that right now, but it’s an option.

Then I can publish the agent. I’ll add a description: “This model can answer your questions on taxi trips.”

Click publish. Now everybody with access to this workspace or to this model can use it to start asking questions. That’s actually quite cool.

Of course, we can set permissions here to control who has access.

Why This Matters

This is very straightforward. I’m excited about this.

I have customers that say, “I can use ChatGPT to answer questions about my business. I can upload PDF files with manuals, with documents, and start asking questions about them instead of looking manually through things. Why can’t I do that with Power BI?”

Actually, you can—with Fabric Data Agents. As of today, this is still in preview, but it works as you can see, and it’s actually quite accurate as well.

Key Takeaways

Fabric Data Agents let you have natural language conversations with your data. You don’t need pre-built dashboards for every question someone might have.

The AI needs some help through instructions to understand your specific business context and data structure. But once you set that up, it can handle complex queries, follow-up questions, and even year-over-year comparisons.

Always validate the results, especially in the beginning. Check the generated DAX queries to understand what the agent is actually doing behind the scenes.

Are you planning to use Fabric Data Agents in your environment? What questions would you ask your data? Let me know in the comments below!
