Blog – Daniel Crawford's Data Blog

Fabric – Easy Hub and Spoke Architecture

Posted on July 31, 2023June 18, 2024 by Daniel Crawford

If you have an existing hub and spoke architecture, moving your architecture to Fabric may not be as difficult as you might initially think. One of the coolest features in Fabric is shortcuts. Shortcuts allow you to access your data that is stored in delta format (in a data lake on ADLS, S3, or OneLake)…

Synapse Fundamentals for Tuning – Replicated Tables

Posted on January 20, 2023June 23, 2023 by Daniel Crawford

I covered some of the basics of table geometry in this post: Synapse Fundamentals for Performance Tuning – Distribution; however, it’s critical to dive deeper into the performance impact of having the correct table distribution defined. Sometimes we think we have the correct distribution chosen for a table but are painfully mistaken as performance suffers. Generally, that…

Synapse Fundamentals for Tuning – CPU Intensive Operations

Posted on December 9, 2022December 9, 2022 by Daniel Crawford

This performance consideration is not unique to Synapse but can be exasperated in Synapse due to the quantity of data that is being worked with. The CPU intensive operations that I am referring to come in three different flavors. These topics all have one thing in common. They are operations that could have been pushed…

Synapse Fundamentals for Tuning – Computed Columns

Posted on December 2, 2022June 18, 2024 by Daniel Crawford

So why are computed columns such a big performance consideration in Synapse Dedicated SQL Pools? There are two main reasons: Many times, computed columns are just a variation of the data that was originally stored in the source system. Instead of transforming the data as part of an ELT (Extract, Load, and Transform) process, a…

Synapse Fundamentals for Tuning – Nesting

Posted on November 23, 2022November 23, 2022 by Daniel Crawford

Calling stored procedures from other stored procedures or referencing a view within another view can introduce performance complexities that are difficult to troubleshoot. I will breakdown views and stored proc nesting issues individually, but the same message applies to both. Avoid heavily nested objects in Synapse if reliable performance is a priority. Understandably, we like…

Synapse Fundamentals – Table Creation Made Easy

Posted on September 27, 2022September 29, 2022 by Daniel Crawford

One of the most common issues encountered with Azure Synapse Dedicated SQL Pools is the confusion of table creation. The confusion stems from separating how a table is stored from where it is stored. I have already talked a little bit about this in previous posts (regarding distribution and indexing) but I want to dedicate…

Synapse Fundamentals for Tuning – Partitioning

Posted on September 23, 2022September 28, 2022 by Daniel Crawford

Next on my list of top performance killers in Synapse Dedicated SQL Pools, is Partitioning. Partitioning is too often overused in Synapse. Let’s first talk about when you would use partitioning and then how to use it effectively. Partitioning should only be applied on a table with a very large number of records and even…

Synapse Fundamentals for Tuning – Indexes Part2

Posted on September 14, 2022September 28, 2022 by Daniel Crawford

Rowstore Index Health For SQL Server, rowstore index fragmentation is a critical indicator for performance. In Synapse Dedicated SQL Pools, the same holds true but high fragmentation is generally less of a performance impact since tables are distributed. This however doesn’t mean that clustered/non-clustered indexes and heaps don’t have to be rebuilt. Due to the…

Synapse Fundamentals for Tuning – Indexes Part1

Posted on August 31, 2022September 28, 2022 by Daniel Crawford

If you have been around SQL Server for any length of time, you know by now that indexes are critical for performance. In Synapse Dedicated SQL Pools, indexes play a lesser role in query tuning because they do not impact the DSQL plans but rather impact the SQL plans on each of the distributions for…

Synapse Fundamentals for Performance Tuning – Distribution

Posted on August 12, 2022September 28, 2022 by Daniel Crawford

If you are coming from a SQL Server world, what are the fundamental concepts in Synapse that are essential to building a performant Dedicated SQL Pool? How does Synapse differ from regular SQL Server? Let’s start with the most basic foundational principle of distribution. You might hear the term “MPP” thrown around when we talk…