How to change Microsoft Fabric Python resources

A couple of weeks ago, I showed how to use DuckDB as a Python library in Microsoft Fabric to work with Delta Lake from Python notebooks instead of Spark notebooks. The benefits of Python over Spark: Python is more lightweight, session startup times are much shorter, and these notebooks use less …

Vibe Coding in Microsoft Fabric

Can ChatGPT actually help you build a complete data platform in Microsoft Fabric if you pretend to know absolutely nothing? That’s exactly what I set out to test. I gave myself a simple scenario: I’m a data engineer with a SQL database that needs to be transformed into a proper medallion architecture with bronze, silver, …

Case Sensitivity in Microsoft Fabric Spark

I recently ran into a sneaky issue while migrating a customer from SQL Server to Microsoft Fabric. We were doing what every lazy developer does—copy-pasting code—and suddenly our joins stopped working. Half the data was missing, relationships weren’t connecting, and it took me way too long to figure out why. The culprit? Case sensitivity. SQL …
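
A minimal sketch of the symptom described above, in plain Python rather than Spark so it runs anywhere. It assumes the mismatch sits in the join key values; the table contents, the `customer_code` column, and the `inner_join` helper are all hypothetical and only illustrate how a case-sensitive comparison silently drops rows.

```python
# Hypothetical sample data: the same customer code stored with different casing.
customers = [
    {"customer_code": "ABC", "name": "Contoso"},
    {"customer_code": "def", "name": "Fabrikam"},
]
orders = [
    {"customer_code": "abc", "amount": 10},
    {"customer_code": "def", "amount": 20},
]


def inner_join(left, right, key, normalise=None):
    """Inner join two lists of dicts on `key`, optionally normalising the key."""
    norm = normalise or (lambda value: value)
    index = {}
    for row in right:
        index.setdefault(norm(row[key]), []).append(row)
    return [
        {**l_row, **r_row}
        for l_row in left
        for r_row in index.get(norm(l_row[key]), [])
    ]


# Case-sensitive comparison: "abc" does not match "ABC", so one order is lost.
strict = inner_join(orders, customers, "customer_code")

# Normalising both sides to lower case restores the missing match.
relaxed = inner_join(orders, customers, "customer_code", normalise=str.lower)

print(len(strict), len(relaxed))
```

In a Spark notebook the equivalent fix would typically be to wrap both sides of the join condition in `lower()` or `upper()`.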

OneLake Security Preview in Microsoft Fabric

Recently, Microsoft released a private preview of OneLake Security for Microsoft Fabric lakehouses. This is amazing: it is a feature I have been waiting on for a long time. The promise with Fabric was that security at the most detailed grain (both row-level and column-level) would be implemented in OneLake. However, …

Speaking at Fabric Cafe – Don’t Repeat Yourself

I had the pleasure of speaking at Fabric Cafe, an online community aimed at sharing knowledge about Microsoft Fabric. And I’m all in favour of that! My session, titled ‘Don’t Repeat Yourself – how custom Python modules give back hours of your time’, focused on the software engineering principle of DRY, which can be brought …

Delta Lake Liquid Clustering vs Partitioning

Introduction to Delta Lake Liquid Clustering

As your Delta tables grow in size, performance tuning in Microsoft Fabric becomes essential. In this post, I’ll explore two powerful optimisation techniques: Delta Lake Partitioning and Liquid Clustering. Both can help improve query speed and reduce costs, but they work in very different ways. …

Extracting Paginating APIs Without NextPage Metadata with Microsoft Fabric Notebooks

Most APIs these days have some kind of pagination built in. This ensures that queries against the underlying database do not return too much data, which would hurt database performance and send overly large messages across the network. Often, these APIs will tell you in their responses how many …
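
One common way to handle an API that gives no next-page metadata can be sketched as follows: keep requesting pages until one comes back empty or shorter than the page size. This is only an illustration under stated assumptions, not necessarily the approach of the post. The `fetch_page` callable, its `(page, page_size)` signature, and the fake data source are hypothetical stand-ins for a real HTTP call.

```python
from typing import Callable, Iterator, List


def fetch_all_pages(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 100,
    max_pages: int = 10_000,
) -> Iterator[dict]:
    """Yield items page by page until the API signals the end implicitly.

    `fetch_page(page, page_size)` is a hypothetical callable that returns the
    items of one page; in a notebook it would wrap the actual HTTP request.
    """
    page = 1
    while page <= max_pages:  # safety cap so a misbehaving API cannot loop forever
        items = fetch_page(page, page_size)
        if not items:
            break  # an empty page means we ran past the last one
        yield from items
        if len(items) < page_size:
            break  # a short page must be the last one
        page += 1


# Stand-in for a real API: 250 rows served in slices (assumption for the demo).
_FAKE_ROWS = [{"id": i} for i in range(250)]


def fake_fetch_page(page: int, page_size: int) -> List[dict]:
    start = (page - 1) * page_size
    return _FAKE_ROWS[start:start + page_size]


rows = list(fetch_all_pages(fake_fetch_page, page_size=100))
print(len(rows))
```

The short-page check saves one request in most cases; the empty-page check covers totals that are an exact multiple of the page size.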

Notebook Orchestration in Microsoft Fabric

Coming from the ‘old school’ world of SSIS and SQL Server, and later Azure Data Factory and Azure SQL Database, I have always built my ETL orchestration processes using some kind of pipeline. In Fabric, we also have pipelines (the successor of ADF), but we can now also build notebook orchestration using NotebookUtils and runMultiple(). …
