Blog

All posts are sorted by newest first. The structure is ready for future pagination and filtering by topic.

Implementing Change Data Feed for Incremental Processing in Microsoft Fabric Lakehouses

Most lakehouse pipelines are still doing one expensive thing repeatedly: reprocessing entire tables, even when only a tiny fraction of rows has changed. This post explains the CDF-driven load pattern, why it works, where it can hurt, and how to implement it safely in Microsoft Fabric.

/implementing-change-data-feed-for-incremental-processing-in-microsoft-fabric-lak

Microsoft Fabric, Spark

How to Use Concurrent Sessions in Fabric Data Pipelines

Speed up your Fabric pipelines by running notebooks in parallel, without breaking your Spark cluster. Learn how High Concurrency Mode works and how to configure session tags.

/how-to-use-concurrent-sessions-in-fabric-data-pipelines

Spark, Data Engineering

GENERATED BY DEFAULT vs GENERATED ALWAYS in Databricks

When defining identity columns in Databricks tables, two common options for automatic identity value generation are GENERATED BY DEFAULT and GENERATED ALWAYS. This article explains the differences and when to use each.

/generated-by-default-vs-generated-always-in-databricks

Data Engineering

Use Key Vault Secrets in Azure Data Factory

Azure Data Factory can securely access credentials stored in Azure Key Vault. This article describes how to set up Key Vault secrets for Azure Data Factory linked services.

/use-key-vault-secrets-in-azure-data-factory

Azure, Data Engineering

Manage Secret Scopes in Databricks

Databricks can connect to various sources for data ingestion, and those connections typically need credentials. This article describes how to manage Azure Key Vault-backed secret scopes in Databricks using the GUI.

/manage-secret-scopes-in-databricks

Azure, Spark