How to use Task Flows in Microsoft Fabric

Today I want to talk about task flows inside Microsoft Fabric—what they can be used for, how we can change them, how we can add them, and how we can make sense of our projects and make them a bit less messy.

Let’s dive in and find out what task flows are all about.

That Gray Bar Everyone Ignores

If you’re used to working with Microsoft Fabric workspaces, you’ve probably seen this gray bar. Usually a workspace opens just like this: an empty bar in the middle of the screen, taking up a lot of screen real estate, with nothing in it.

It’s meant for task flows, but by default there are no task flows in a workspace.

Before I figured out what task flows are all about, I would simply click the button to collapse it, basically removing it from my view so I could actually see my objects: my warehouses, notebooks, pipelines, and so on.

But today we’re going to look at this part of the screen and how we can actually create tasks.

What Are Task Flows?

Task flows are a way to document your process and make it clearer what the objects in your workspace are actually doing.

We can add tasks to our task flows or create a pre-designed flow. Let’s look at the pre-designed flows first because this will help us tell the story about what task flows are.

What we see is a set of pre-designed task flows: flowchart-like templates where you can place the pieces of your data platform puzzle into specific tasks within a task flow.

What is a task? A task can be a thing like “get data” or “store data” or “visualize data” or “talk to AI” or anything like that.

What this helps you do: if you have larger workspaces with a lot of objects—lakehouses, warehouses, pipelines, notebooks, Power BI reports, and so on—it will help you make sense of what’s actually in your workspace.

Creating a Medallion Task Flow

Let’s begin by applying a pre-designed task flow to an empty workspace. I’m looking at our Kimura internal data warehouse here.

I’m not going to show you any data today—unfortunately, that’s not allowed. But I can show this workspace. I just removed all the task flows so we can use it as a demo.

The reason I’m picking this one and not one of my regular “That Fabric Guy” workspaces is that this is an actual data platform, an actual data warehouse that we run inside Microsoft Fabric. In one workspace, we have all three layers of the medallion architecture. We have notebooks, pipelines, Power BI reports, semantic models, and so on.

So this will be a really good example.

Selecting the Medallion Template

I think the most fitting task flow is the Medallion task flow here. Let’s select that one.

Immediately you see that the screen is populated with a task flow. Going through it, we can see there are a few types of tasks that we can have, and they’re color-coded so we can easily find them.

Once we start adding objects or Fabric artifacts to them, those will be color-coded as well to make our lives in finding stuff a bit easier.

What do we have here? We have “get data” tasks, “store data,” “prepare data” (this is where your notebooks would go, for example), “visualize data” for Power BI, “analyze and train data,” and so on.

Customizing Tasks

Let’s start at the beginning. Click on the “Low volume data ingestion” task. On the right side of the screen, we find the settings of this task.

It’s called “Low volume data ingestion.” We can remove it or edit it; for example, we can edit the name. Let’s change it to just “Ingest,” because I won’t make a distinction between low and high volume.

This is “Ingest data into the data lake.” Save it.

Each task has a task type that filters the list of recommended items: things like “get data,” “mirror data,” and so on. For this task, the items we can add are copy jobs, dataflows, notebooks, or Spark job definitions.

We’ll be using notebooks because the ingestion for our lakehouse (or for our data lake) is notebooks.
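As a mental model (not an official Fabric API; the task names and the mapping are my own simplification of the template described above), you can think of each task type as a filter over which item types it will recommend:

```python
# Hypothetical mapping from a task (loosely following the Medallion template)
# to the Fabric item types you would typically attach to it.
# Purely illustrative; not an official Fabric data structure.
TASK_ITEM_TYPES = {
    "Ingest": {"Copy job", "Dataflow", "Notebook", "Spark job definition"},
    "Store": {"Lakehouse", "Warehouse"},
    "Prepare": {"Notebook", "Dataflow"},
    "Visualize": {"Report", "Dashboard", "Semantic model"},
}

def recommended_for(task: str) -> set[str]:
    """Return the item types a task would recommend when creating a new item."""
    return TASK_ITEM_TYPES.get(task, set())

print(sorted(recommended_for("Ingest")))
```

In our case, notebooks fall inside the “Ingest” set, which is why they’re a valid choice here.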

Let’s remove that “High volume” task. Should be super easy.

Let’s look at bronze data storage, silver processing and transformation, and then gold. That all looks all right. Then we have the visualize task where we can put Power BI stuff; dashboards and reports sounds good.

Then we have ML serving. We can put stuff in there, but we don’t do machine learning right here. So let’s just remove that.

I think for now we have quite a decent task flow here.

Adding Items to Tasks

Looking at our artifacts here, we have a semantic model, a lakehouse, an environment, two reports, and then a folder with pipelines (just two pipelines). Then we have a notebook folder with a maintenance notebook, notebooks for configuration, and notebooks for each of the stages in the medallion architecture.

Let’s start adding stuff to our “Ingest data” task.

Two Ways to Add Items

We can click on “New item” here. There’s a small button that lets us create new items recommended for this task, and the new item is immediately attached to the task in the task flow.

We can also use the attachment button, and here we can start adding attachments.

Let’s go into Bronze and select the ingest notebooks that we’ll use. Just like that, we added three items to my task in the task flow.

Now when I find the items in my object browser, I’ll see that the task connected to this item is the “Ingest” task.

From here, I can click on the task and it will filter down to all the objects in that task, regardless of which folders they’re stored in. So it groups them together, which makes it easier to find stuff.
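To make the grouping behavior concrete, here is a tiny sketch of what that filter does conceptually. The item and folder names are made up; the point is that the task label, not the folder, drives the grouping:

```python
# Items live in folders, but clicking a task groups them by task label instead.
# All names below are invented for illustration.
items = [
    {"name": "nb_ingest_sales", "folder": "Notebooks/Bronze", "task": "Ingest"},
    {"name": "nb_clean_sales",  "folder": "Notebooks/Silver", "task": "Silver Processing"},
    {"name": "pl_extract",      "folder": "Pipelines",        "task": "Ingest"},
    {"name": "rpt_sales",       "folder": "Reports",          "task": "Visualize"},
]

def items_in_task(task: str) -> list[str]:
    """Return the names of items attached to a task, regardless of folder."""
    return [i["name"] for i in items if i["task"] == task]

print(items_in_task("Ingest"))  # → ['nb_ingest_sales', 'pl_extract']
```

Note that the two “Ingest” items come from different folders, yet the task view shows them side by side.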

Building Out the Flow

Do we have more ingestion stuff? Yes, we do. We have the orchestration pipeline—though this is not just ingestion, it’s processing as well.

Maybe we should add a new task here that’s more general. Let’s add the pipeline for data source extraction to our ingest step as well.

Now here we can see that once we start clicking on a task, regardless of the location of the objects, it will filter to those objects that we added to the task. We can find the location in here—different folders for notebooks, different folders for pipelines.

Important Limitation: One Task Per Item

Now let’s see what happens if we start adding the same stuff to multiple tasks.

That will not work: an item can belong to only one task at a time. I just added it to Bronze, then to Silver, and now those items are labeled Silver instead. So what we have to do is remove our Silver and Gold storage objects.
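Conceptually, task membership behaves like a single label per item: assigning an item to a second task replaces the first assignment rather than adding another. A made-up sketch of that behavior:

```python
# Each item can belong to at most one task, so re-assigning it
# overwrites the previous assignment, much like in the Fabric UI.
task_of: dict[str, str] = {}

def assign(item: str, task: str) -> None:
    # A second assignment silently replaces the old task label.
    task_of[item] = task

assign("lakehouse_main", "Bronze Storage")
assign("lakehouse_main", "Silver Storage")  # moves it; doesn't add a second label
print(task_of["lakehouse_main"])  # → Silver Storage
```

That single-label behavior is exactly why we consolidate into one shared storage task below.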

We’re going to create this little flowchart where we have our bronze data, and we’re going to rename that to “Data Storage – Lakehouse for Storage.”

This is just the storage task where we put all of our data. The ingest step puts data in there. Then processing will also interact with our data storage, and it will interact in two directions.

The notebooks (the processing notebooks) will write data into the storage, but they can also read data from storage to do further processing.

Organizing the Layout

From here we can drag connections between tasks. From data storage, we also move into the visualization layer.

It’s a bit messy, but let’s see if we can arrange these things in a nicer way.

I don’t really like how this is laid out. It doesn’t make it really clear what’s going on. Anyway, let’s attach the stuff to my data storage here.

Let’s call the silver processing “Silver Processing – Clean and transform the data to ensure data quality.” That’s what Silver is doing.

For the gold processing: “Gold Processing – The final transformation, highly structured, reliable, analysis-ready data.” This is exactly what my gold notebooks are doing. So that’s actually fine.
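To illustrate what those task descriptions mean in practice, here is a minimal, invented sketch of the medallion idea: bronze holds raw rows, silver cleans them to ensure data quality, and gold produces an analysis-ready aggregate. The data and function names are hypothetical, not from our actual notebooks:

```python
# Bronze: raw, untyped rows straight from the source (invented sample data).
bronze = [
    {"order": "A1", "amount": "10.5"},
    {"order": "A2", "amount": None},   # bad row: missing amount
    {"order": "A3", "amount": "4.5"},
]

def to_silver(rows):
    """Clean and type the data to ensure data quality (drop bad rows)."""
    return [{"order": r["order"], "amount": float(r["amount"])}
            for r in rows if r["amount"] is not None]

def to_gold(rows):
    """Final transformation: a highly structured, analysis-ready aggregate."""
    return {"total_amount": sum(r["amount"] for r in rows)}

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # → {'total_amount': 15.0}
```

In a real Fabric workspace each of these steps would be a notebook attached to its own task, which is exactly what the Silver and Gold processing tasks document.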

Adding All the Pieces

We have a lot of objects that aren’t assigned to any processing task yet. Let’s start adding all the notebooks from Silver. Then for Gold processing, we do the same.

We go to Notebooks, go to Gold, and manually select everything in here. These are the dimensions and the orchestration that we have for Gold.

Then we have another piece to the puzzle: data visualization. It’s doing Power BI stuff. Let’s start adding Power BI and the semantic model in there.

Now we see there are a few different things we’re not using anymore. The semantic model that was joined to the lakehouse is now obsolete—we’re not using it anymore. You can see this is a very old data object.

Maybe the configuration and maintenance notebooks should be in a general task or maybe in a maintenance task. But you get where I’m going with this.

The Final Result

Now we’ve created labels that show up in our artifact list. By simply scanning the tasks here (the colors, basically the labels), we can easily see that these are storage objects and these are visualization objects.

For pipelines, we find this is a general one, this is an ingestion one, and so on. That makes your life a little bit easier if you’re working with large-scale environments where you have a lot of artifacts, a lot of objects in your workspaces.

Let’s clean up the arrows and connections to make the model look a little bit cleaner. You can select them and hit delete. Just remove these arrows.

You can also click and drag tasks to rearrange them. Let’s move this one downstream and make it a left-to-right schema, something like this. That’s an easier model to look at.

What We’ve Learned

What we’ve learned today is how to use task flows to visualize the process our data goes through, from left to right, when we’re building data solutions in Microsoft Fabric.

We’re in a workspace. That workspace contains a lot of different objects, and those objects can be arranged into separate tasks that have to be executed in order to process data all the way from a source ingestion to an analysis step in Power BI or in machine learning models and so on.

At the same time, we’ve done more than visualize that process: we’ve also tagged the objects so we can easily find them.

By clicking on “Ingest,” we’ll find all the objects involved in ingestion. And just looking at this list, I can easily see all the tags, all the tasks that are there. I call them tags because task flows basically tag artifacts so you can find them.

Key Benefits of Task Flows

  • Visual documentation of your data flow from ingestion to visualization
  • Color-coded organization that makes objects easy to find
  • Filtering capability to show only objects in a specific task, regardless of folder location
  • Onboarding tool to help new team members understand your workspace structure

Are you using task flows in your Fabric workspaces? How are you organizing your objects? Let me know in the comments below!
