Run Spark SQL in Fabric Notebooks Without Attaching a Default Lakehouse
How to keep your Spark jobs portable, safe for concurrency, and easier to orchestrate by using explicit ABFSS paths instead of attaching a default Lakehouse in Microsoft Fabric.
# Introduction
In Microsoft Fabric, notebooks typically expect a default Lakehouse to be attached when you use Spark SQL. But you don't have to do that. Instead, you can:
- Read/write Delta tables via explicit ABFSS paths (OneLake),
- Register temporary views for SQL,
- Keep your notebook completely decoupled from any default Lakehouse.
This pattern works well for concurrent job execution, reduces accidental writes to the wrong Lakehouse, and makes notebooks portable across workspaces. Below is a simple, copy-paste-friendly reference you can use in your Fabric notebook.
# Why avoid a default Lakehouse?
Attaching a default Lakehouse is convenient for ad‑hoc exploration, but it can become a liability in production. Here's why:
## 1. Concurrency and Spark Session Reuse in Fabric
Fabric uses Spark session pooling to optimize performance when running multiple notebooks in parallel:
- If all notebooks share the same default Lakehouse, Fabric can reuse the same Spark session across them → less overhead, faster job execution.
- If notebooks have different default Lakehouses attached, Fabric creates a new Spark session for each notebook → more sessions, more overhead, longer total job time.
This creates a hidden dependency: to benefit from session reuse, you must standardize the same default Lakehouse across all notebooks. In real-world scenarios — especially when working with multiple Lakehouses or environments — this is often impractical.
By not attaching any default Lakehouse and using explicit ABFSS paths, you:
- Remove the dependency on session reuse rules,
- Avoid accidental session fragmentation,
- Keep concurrency predictable and under your control.
## 2. Easier Environment Promotion
When deploying notebooks across Dev, Test, and Prod environments, you don't need to worry about changing the default Lakehouse attachment. By using explicit ABFSS paths, you can parameterize Lakehouse IDs or names and switch environments without manual reconfiguration. This makes CI/CD pipelines cleaner and reduces deployment errors.
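To make the parameterization idea concrete, here is a minimal sketch of a per-environment path builder. The environment names, dictionary layout, and `tables_root` helper are assumptions for illustration; in a real Fabric setup these values would come from a parameter cell or pipeline parameters rather than being hard-coded.

```python
# Minimal sketch (hypothetical names): resolve explicit OneLake Tables/ roots
# per environment so the same notebook runs in Dev/Test/Prod unchanged.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"

# Hypothetical per-environment IDs; supply these via notebook/pipeline
# parameters in practice instead of hard-coding them.
ENVIRONMENTS = {
    "dev":  {"workspace_id": "<DEV-WORKSPACE-GUID>",
             "silver_lakehouse_id": "<DEV-SILVER-LAKEHOUSE-ID>"},
    "prod": {"workspace_id": "<PROD-WORKSPACE-GUID>",
             "silver_lakehouse_id": "<PROD-SILVER-LAKEHOUSE-ID>"},
}

def tables_root(env: str, lakehouse_key: str) -> str:
    """Build the explicit ABFSS Tables/ root for the given environment."""
    cfg = ENVIRONMENTS[env]
    return (f"abfss://{cfg['workspace_id']}@{ONELAKE_HOST}/"
            f"{cfg[lakehouse_key]}/Tables/")

# In the notebook body you would then write, for example:
# SILVER_TABLES = tables_root(env_name, "silver_lakehouse_id")
```

Because the notebook only ever sees a path string, promotion becomes a one-parameter change with no Lakehouse attachment to edit.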
Trade-off: You'll write a few more lines of boilerplate and you'll lose some of the "browse tables from default" convenience in the notebook UI. For production jobs, explicitness is usually worth it.
# The pattern (dummy example you can use)
Works in a Fabric Notebook Spark session. If you pull in a shared code library via %run, keep that cell; otherwise remove it. Replace all DEMO* placeholders with your own values.
```python
# ==========================================
# Demo: Spark SQL WITHOUT default Lakehouse
# ==========================================

# ---- 1) Identify your OneLake paths explicitly ----
workspace_id = "<YOUR-WORKSPACE-GUID>"
bronze_lakehouse_id = "<YOUR-BRONZE-LAKEHOUSE-ID>"
silver_lakehouse_id = "<YOUR-SILVER-LAKEHOUSE-ID>"

BRONZE_TABLES = f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{bronze_lakehouse_id}/Tables/"
SILVER_TABLES = f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{silver_lakehouse_id}/Tables/"

# ---- 2) Read from Bronze explicitly via ABFSS (Delta) ----
source_table = "DEMO_Bronze_customers"  # dummy table name
df_bronze = spark.read.format("delta").load(f"{BRONZE_TABLES}{source_table}")

# ---- 3) Create a TEMP VIEW for SQL (in-session only) ----
df_bronze.createOrReplaceTempView("vw_demo_bronze_customers")

# ---- 4) Use Spark SQL safely without a default Lakehouse ----
# Transform: pick/rename columns, mock business logic
df_silver = spark.sql("""
    SELECT
        CAST(customer_id AS STRING) AS customerNumber,
        CONCAT(TRIM(first_name), ' ', TRIM(last_name)) AS customerName,
        UPPER(TRIM(city)) AS customerLocation
    FROM vw_demo_bronze_customers
    WHERE active_flag = true
""")

# ---- 5) Write to a target Delta table explicitly (Silver) ----
target_table = "DEMO_Silver_customers"  # dummy target
(df_silver
    .write
    .format("delta")
    .mode("overwrite")  # consider 'append' + MERGE for incremental cases
    .option("overwriteSchema", "true")
    .save(f"{SILVER_TABLES}{target_table}")
)

print("✅ Completed: Spark SQL without default Lakehouse")
```
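For the incremental case hinted at in the write step, one possible variant is a Delta `MERGE INTO` that addresses the Silver table by its ABFSS path (`delta.`path``), so no default Lakehouse is needed there either. The column mappings below mirror the demo transform; the path and matching key are assumptions you would adapt to your own schema.

```python
# Hypothetical incremental variant: MERGE Bronze rows into the Silver Delta
# table addressed directly by its OneLake path (no Lakehouse attachment).
silver_path = (
    "abfss://<YOUR-WORKSPACE-GUID>@onelake.dfs.fabric.microsoft.com/"
    "<YOUR-SILVER-LAKEHOUSE-ID>/Tables/DEMO_Silver_customers"
)

merge_sql = f"""
MERGE INTO delta.`{silver_path}` AS tgt
USING vw_demo_bronze_customers AS src
ON tgt.customerNumber = CAST(src.customer_id AS STRING)
WHEN MATCHED THEN UPDATE SET
    tgt.customerName = CONCAT(TRIM(src.first_name), ' ', TRIM(src.last_name)),
    tgt.customerLocation = UPPER(TRIM(src.city))
WHEN NOT MATCHED THEN INSERT (customerNumber, customerName, customerLocation)
VALUES (CAST(src.customer_id AS STRING),
        CONCAT(TRIM(src.first_name), ' ', TRIM(src.last_name)),
        UPPER(TRIM(src.city)))
"""

# In a Fabric notebook Spark session you would execute it with:
# spark.sql(merge_sql)
```

The `delta.`path`` syntax keeps the statement fully path-addressed, which is what preserves the decoupling this article argues for even in the incremental path.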