Microsoft Fabric Lakehouse uses Delta Lake as the default table format for reliable, high-performance data storage and processing. While other formats are supported, Delta Lake provides the best integration across Fabric services.
## What are Delta Lake tables?

When you store data in a Microsoft Fabric Lakehouse, it's stored in the Delta Lake format by default. Delta Lake adds capabilities that improve both performance and reliability:
- Better performance: Faster queries and data processing.
- Data reliability: Transactional consistency and integrity checks.
- Flexibility: Works with both structured data (for example, tables) and semi-structured data (for example, JSON).
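Because Delta is the default, no format needs to be specified when writing a table. A minimal sketch for a Fabric notebook (the `spark` session is provided by the notebook; the table name and columns are illustrative):

```python
# Assumes a Fabric notebook, where a `spark` session is provided.
# Table name and columns are illustrative.
df = spark.createDataFrame(
    [(1, "Contoso", 150.0), (2, "Fabrikam", 75.5)],
    ["OrderId", "Customer", "Amount"],
)

# No format argument needed: Fabric's default table format is Delta Lake,
# so this appears as a Delta table in the Lakehouse Tables section.
df.write.mode("overwrite").saveAsTable("sales_sample")
```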
## Why does this matter?
Delta Lake is the standard table format for all data in Fabric Lakehouse. This means:
- Consistency: All data uses the same table format.
- Compatibility: Data works across Fabric tools such as Power BI, notebooks, and pipelines.
- No extra setup: When you load data into tables or use other data loading methods, Delta format is applied automatically.
Fabric handles Delta formatting behind the scenes, so you can focus on modeling and analysis.
## Apache Spark engine and data formats
Fabric Lakehouse is powered by Apache Spark Runtime, which shares foundations with Azure Synapse Analytics Runtime for Apache Spark. Fabric also applies different defaults and optimizations for better out-of-the-box performance across Fabric workloads.
Supported data formats:
- Delta Lake: Preferred format with automatic optimization.
- CSV: Delimited text data.
- JSON: Semi-structured application and web data.
- Parquet: Compressed columnar files.
- Other formats: AVRO and legacy Hive table formats.
Key benefits of Fabric Spark defaults:
- Optimized by default: Performance features are automatically enabled for better speed.
- Multiple formats supported: You can read from existing files in various formats.
- Automatic conversion: When you load data into tables, it's automatically optimized using the Delta Lake format.
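As a sketch of that conversion, reading an existing CSV file from the Files section and loading it into a table might look like this in a Fabric notebook (paths and names are illustrative; the `spark` session is provided by the notebook):

```python
# Illustrative path under the Lakehouse Files section.
raw = spark.read.option("header", "true").csv("Files/raw/orders.csv")

# Writing the data as a table applies the Delta Lake format automatically.
raw.write.mode("overwrite").saveAsTable("orders")
```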
Note
While you can work with different file formats, tables displayed in the Lakehouse explorer are optimized Delta Lake tables for the best performance and reliability.
## Differences from Azure Synapse Analytics
If you're migrating from Azure Synapse Analytics, here are the key configuration differences in Fabric's Apache Spark runtime:
For a broader comparison across Spark pools, configurations, libraries, notebooks, and Spark job definitions, see Compare Fabric Data Engineering and Azure Synapse Spark.
| Apache Spark configuration | Microsoft Fabric value | Azure Synapse Analytics value | Notes |
|---|---|---|---|
| spark.sql.sources.default | delta | parquet | Default table format |
| spark.sql.parquet.vorder.default | true | N/A | V-Order writer |
| spark.sql.parquet.vorder.dictionaryPageSize | 2 GB | N/A | Dictionary page size limit for V-Order |
| spark.databricks.delta.optimizeWrite.enabled | true | unset (false) | Optimize Write |
These optimizations are designed to provide better performance out-of-the-box in Fabric. Advanced users can modify these configurations if needed for specific scenarios.
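In a notebook, these session-level settings can be inspected or overridden with the standard Spark configuration API; a minimal sketch (assumes a Fabric notebook `spark` session, with the configuration names taken from the table above):

```python
# Inspect the Fabric defaults listed in the table above.
print(spark.conf.get("spark.sql.sources.default"))                   # delta
print(spark.conf.get("spark.databricks.delta.optimizeWrite.enabled"))

# Advanced scenarios can override a default for the current session only:
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "false")
```

Overrides made this way apply only to the current Spark session, so the Fabric defaults return on the next session.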
## How Fabric finds your tables automatically
When you open your Lakehouse, Fabric automatically scans your data and displays any tables it finds in the Tables section of the explorer. This means:
- No manual setup required: Fabric automatically discovers existing tables.
- Organized view: Tables appear in a tree structure for easy navigation.
- Works with shortcuts: Tables linked from other locations are also discovered automatically.
This automatic discovery makes it easy to see all your available data at a glance.
## Use shortcuts with tables and files
OneLake shortcuts can point to Delta tables or file and folder paths, so you can reference external data without moving it. The following table summarizes recommended patterns based on the target data type.
| Data type at shortcut target | Where to create the shortcut | Best practice |
|---|---|---|
| Delta Lake table | Tables section | If multiple tables are present in the destination, create one shortcut per table. |
| Folders with files | Files section | Use Apache Spark with relative paths to read directly from the shortcut target. Load into Lakehouse-native Delta tables for maximum performance. |
| Legacy Apache Hive tables | Files section | Use Apache Spark with relative paths, or create a metadata catalog reference using CREATE EXTERNAL TABLE. Load into Lakehouse-native Delta tables for maximum performance. |
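The relative-path pattern for a Files-section shortcut can be sketched as follows (the shortcut name, folder layout, and file format are illustrative; assumes a Fabric notebook `spark` session):

```python
# Read directly from a shortcut target under the Files section
# using a relative path. "external_shortcut/sales" is illustrative.
df = spark.read.parquet("Files/external_shortcut/sales/")

# For maximum performance, load the data into a
# Lakehouse-native Delta table.
df.write.mode("overwrite").saveAsTable("sales")
```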
## Load to tables
Microsoft Fabric Lakehouse provides a visual experience to load common file formats into Delta tables. To learn more, see Load to Delta Lake tables.
## Keeping your tables fast and efficient
Fabric automatically optimizes your Delta Lake tables for better performance, but sometimes you may want additional control:
What Fabric does automatically:
- Combines small files into larger, more efficient files.
- Optimizes data layout for faster queries.
- Manages storage to reduce costs.
When you might need manual optimization:
- Very large datasets with specific performance requirements.
- Custom data organization needs.
- Advanced analytics scenarios.
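When manual control is needed, standard Delta Lake maintenance commands can be run from a notebook; a minimal sketch (the table name is illustrative, and a Fabric notebook `spark` session is assumed):

```python
# Illustrative maintenance on an existing Delta table named `sales`.
spark.sql("OPTIMIZE sales")  # compact small files into larger ones
spark.sql("VACUUM sales")    # remove unreferenced files past the retention period
```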
For detailed guidance on table optimization, see Delta Lake table optimization and V-Order.