In this blog post, I'm starting a new series on Fabric in Production. I've run into several issues deploying Fabric items to production lately, so I thought I'd share my thoughts on the matter in a series of blog posts.
I haven't been writing a lot of blog posts lately; vacations and a lot of personal things got in the way. I've also been doing a lot of testing for the Fabric End to End series, which I definitely plan on continuing.
Note: CI/CD stands for Continuous Integration / Continuous Delivery. It is a set of practices that ease the integration and deployment of applications from one environment to another. In Microsoft Fabric specifically, CI/CD is handled by Git Integration and Deployment Pipelines.
For the record, I'm also co-hosting a Meetup on Fabric in Production with Microsoft MVP Charles-Henri SAUGET next week. The video will be shared afterwards on the Gentil Développeur Data Platform YouTube channel. It will be in French, however.
Project Architecture
In this series of blog posts, we'll work with a fairly classic Microsoft Fabric architecture. Some choices in that architecture may be debatable; the main idea is simply to cover a wide range of Fabric services and see how CI/CD works (or doesn't) for each of them.
Let's briefly go through the architecture.
Ingestion
- An Excel file containing exchange rates, stored on a SharePoint site, will be ingested into a Lakehouse using a Dataflow Gen2
- An Azure SQL DB containing the AdventureWorksLT data (a sample database provided by Microsoft) will be ingested into a Lakehouse using a Data Pipeline
- League of Legends data (which I've already been using in the Microsoft Fabric end to end project series) will be ingested into a Lakehouse from a Notebook calling the League of Legends API, as sketched below
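To make the Notebook ingestion concrete, here is a minimal sketch of what that step could look like. It assumes the notebook runs in Fabric with a default Lakehouse attached (so the `spark` session is available); the endpoint, API key placeholder, and table name are illustrative, not necessarily the ones used in the actual project.

```python
import requests

# Hypothetical placeholders: in practice the key would come from a secure store
RIOT_API_KEY = "<your-riot-api-key>"
ENDPOINT = (
    "https://euw1.api.riotgames.com/lol/league/v4/"
    "challengerleagues/by-queue/RANKED_SOLO_5x5"
)

# Call the League of Legends (Riot) API
response = requests.get(ENDPOINT, headers={"X-Riot-Token": RIOT_API_KEY})
response.raise_for_status()
entries = response.json()["entries"]

# Land the raw payload as a Delta table in the attached Lakehouse
# (`spark` is provided by the Fabric notebook runtime)
df = spark.createDataFrame(entries)
df.write.mode("overwrite").format("delta").saveAsTable("raw_lol_challenger_league")
```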
Storage
- Raw data will be ingested into a Fabric Lakehouse
- The Lakehouse may contain additional configuration files
- Modeled data (as a star schema) will be stored in tables in a Fabric Warehouse
Transformation / orchestration
- Transformations are mainly performed by T-SQL stored procedures in the Fabric Warehouse. These stored procedures fetch raw data from the Lakehouse and push transformed data into the Warehouse
- Additional small transformations (renaming tables/columns, for instance) may be done in Warehouse views or in the Semantic Model
- Everything is orchestrated by a master Data Pipeline which:
  - Ingests source data into the Lakehouse
  - Calls stored procedures to load transformed data into the Warehouse
  - Performs a semantic model refresh (even though the semantic model is in Direct Lake, mainly to illustrate CI/CD on semantic models; see the sketch after this list)
  - Sends an email if an activity fails
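In the pipeline itself, the refresh is performed with the built-in semantic model refresh activity. Just to illustrate what that step does, here is a hedged sketch of triggering the same refresh from a Fabric notebook with the semantic-link (sempy) library; the model and workspace names are hypothetical.

```python
import sempy.fabric as fabric  # semantic-link, preinstalled in Fabric notebooks

# Hypothetical names: replace with the actual semantic model and workspace
fabric.refresh_dataset(
    dataset="Sales Semantic Model",
    workspace="Dev Workspace",
    refresh_type="full",  # for a Direct Lake model this mostly reframes the data
)
```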
Data Visualization
- A Direct Lake semantic model connects to the Warehouse tables and exposes this data in a Power BI report
Git integration / workspaces
- We'll be working with two workspaces (or maybe more, as we'll see later): a development workspace and a production workspace
- Our development workspace will be connected to the main branch of an Azure DevOps repo
For now, everything is working in our development workspace. We now want to deploy the whole project to the production workspace. In the upcoming posts in this series, we'll dive into each component to see how it can be deployed to production.
Conclusion
This short blog post laid the foundation for the Fabric in Production series to come.
Thanks for reading! Stay tuned for part 2!