SQL Interview Questions: Mid Level (Q61–Q130)

About This Guide

Mid-level SQL interviews go beyond writing queries. They test whether you understand why queries perform the way they do — indexes, execution plans, locking, transactions, and query optimisation patterns.

Window Functions

Q61: What is a window function?

A function that performs a calculation across a set of rows related to the current row — without collapsing them into a group. The rows remain visible in the result.

SQL

-- Rank customers by total spend — rows stay individual
SELECT
    CustomerId,
    Name,
    TotalSpend,
    RANK() OVER (ORDER BY TotalSpend DESC) AS SpendRank
FROM CustomerSummary;

Q62: What is the difference between RANK, DENSE_RANK, and ROW_NUMBER?

SQL

-- Sample: scores 100, 100, 90, 80
SELECT Score,
    ROW_NUMBER()  OVER (ORDER BY Score DESC) AS RowNum,     -- 1, 2, 3, 4 (no ties)
    RANK()        OVER (ORDER BY Score DESC) AS Rnk,        -- 1, 1, 3, 4 (gap after tie)
    DENSE_RANK()  OVER (ORDER BY Score DESC) AS DenseRnk    -- 1, 1, 2, 3 (no gap)
FROM Scores;

| | 100 | 100 | 90 | 80 | |---|---|---|---|---| | ROW_NUMBER | 1 | 2 | 3 | 4 | | RANK | 1 | 1 | 3 | 4 | | DENSE_RANK | 1 | 1 | 2 | 3 |

Q63: What is PARTITION BY in a window function?

Divides the window into groups. The function resets for each partition — like GROUP BY but without collapsing rows.

SQL

-- Rank customers by spend WITHIN each country
SELECT
    CustomerId, Country, TotalSpend,
    RANK() OVER (PARTITION BY Country ORDER BY TotalSpend DESC) AS CountryRank
FROM CustomerSummary;
-- Rank restarts at 1 for each country

Q64: What are LAG and LEAD?

Access values from previous (LAG) or next (LEAD) rows in the window.

SQL

-- Month-over-month comparison
SELECT
    Month,
    Revenue,
    LAG(Revenue, 1, 0)  OVER (ORDER BY Month) AS PrevMonthRevenue,
    Revenue - LAG(Revenue, 1, 0) OVER (ORDER BY Month) AS MonthlyChange
FROM MonthlyRevenue;

Q65: What are FIRST_VALUE and LAST_VALUE?

Return the first or last value in the window frame.

SQL

-- Compare each employee's salary to the highest in their department
SELECT
    Name, Department, Salary,
    FIRST_VALUE(Salary) OVER (
        PARTITION BY Department
        ORDER BY Salary DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS DeptMaxSalary
FROM Employees;

Note: LAST_VALUE requires ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to see the full partition.

Q66: What is a running total using window functions?

SQL

SELECT
    OrderDate,
    Amount,
    SUM(Amount) OVER (ORDER BY OrderDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM Orders;

Q67: How do you get the Nth row per group?

SQL

-- Top 3 orders per customer
WITH Ranked AS (
    SELECT
        CustomerId, Id, Amount,
        ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY Amount DESC) AS Rn
    FROM Orders
)
SELECT CustomerId, Id, Amount
FROM Ranked
WHERE Rn <= 3;

Q68: What is NTILE?

Divides rows into N buckets (quartiles, deciles, etc.).

SQL

-- Divide customers into 4 spend quartiles
SELECT
    CustomerId, TotalSpend,
    NTILE(4) OVER (ORDER BY TotalSpend) AS SpendQuartile
FROM CustomerSummary;
-- 1 = lowest, 4 = highest

Indexes

Q69: What is an index and what are the tradeoffs?

An index is a sorted data structure (usually a B-tree) that allows fast lookups on a column. Benefits: fast reads. Costs: slower writes (index must be updated), more disk space, more memory.

Q70: What is a clustered index?

The table rows are physically stored in the order of the clustered index key. Only one per table. The primary key is clustered by default in SQL Server.

Fast for: range scans on the key, lookups by key.

SQL

-- The table is physically sorted by CustomerId
CREATE CLUSTERED INDEX IX_Orders_CustomerId ON Orders (CustomerId);

Q71: What is a non-clustered index?

A separate structure containing the indexed column(s) and a pointer to the actual row. Multiple per table. Read path: find the index entry → follow the pointer to the row (key lookup).

Q72: What is a covering index?

A non-clustered index that includes all columns needed by a query — so SQL Server never needs to look up the actual row.

SQL

-- Query: SELECT Name, Email FROM Customers WHERE Country = 'UK'
-- Covering index includes all three columns
CREATE INDEX IX_Customers_Country ON Customers (Country) INCLUDE (Name, Email);
-- SQL Server reads only the index — no row lookup needed

Q73: What is a composite index and how does column order matter?

An index on multiple columns. Column order matters because the index is sorted by the first column, then the second, etc. A query can use the index only if it filters on a leading prefix of the index columns.

SQL

CREATE INDEX IX_Orders_CustomerStatus ON Orders (CustomerId, Status);

-- Uses index — filters on leading column
SELECT * FROM Orders WHERE CustomerId = 5;

-- Uses index — filters on both
SELECT * FROM Orders WHERE CustomerId = 5 AND Status = 'Pending';

-- CANNOT efficiently use index — skips leading column
SELECT * FROM Orders WHERE Status = 'Pending';

Q74: What is an index seek vs an index scan?

Seek: SQL Server uses the B-tree to jump directly to matching rows. Fast. Triggered by a selective WHERE clause on an indexed column.

Scan: SQL Server reads every row in the index or table. Triggered when: no suitable index, the predicate isn't on an indexed column, or the query isn't selective enough.

Q75: What causes index fragmentation and how do you fix it?

Fragmentation occurs when pages become out of order from inserts/deletes. Queries that should be sequential I/O become random I/O.

SQL

-- Check fragmentation
SELECT index_id, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Orders'), NULL, NULL, 'SAMPLED');

-- Fix < 30% fragmentation — reorganise (online, less resource)
ALTER INDEX IX_Orders_CustomerId ON Orders REORGANIZE;

-- Fix >= 30% fragmentation — rebuild (briefly locks, more thorough)
ALTER INDEX IX_Orders_CustomerId ON Orders REBUILD;

Q76: What is a filtered index?

An index with a WHERE clause — indexes only a subset of rows. Smaller, faster for queries that target that subset.

SQL

-- Only index active customers (most queries filter on IsActive = 1)
CREATE INDEX IX_Customers_Active ON Customers (Email)
WHERE IsActive = 1;

Execution Plans

Q77: What is an execution plan?

SQL Server's plan for executing a query — which indexes to use, how to join tables, estimated row counts. Reading execution plans is the primary tool for query optimisation.

SQL

-- View estimated plan without executing
SET SHOWPLAN_ALL ON;
SELECT * FROM Orders WHERE CustomerId = 5;

-- Or press Ctrl+L in SSMS / click "Include Actual Execution Plan"

Q78: What are the key operators in an execution plan?

| Operator | Meaning | |---|---| | Index Seek | Uses B-tree to find rows — fast | | Index Scan | Reads entire index — potentially slow | | Table Scan | Reads entire table — slow, no useful index | | Key Lookup | Fetches row from clustered index after non-clustered seek — consider covering index | | Hash Match | Join using hash table — memory-intensive | | Nested Loops | Join: for each row in outer, scan inner — fast with small outer set | | Merge Join | Join on sorted inputs — fast for large sorted sets | | Sort | Explicit sort — consider if index can provide order |

Q79: What does a "thick arrow" (wide edge) in an execution plan mean?

Wide arrows indicate many rows being passed between operators. If a thick arrow feeds into an expensive operation, that's the bottleneck — reduce the row count earlier with better filtering or indexes.

Q80: What is a key lookup and how do you eliminate it?

A key lookup happens when SQL Server uses a non-clustered index to find rows, then must go back to the clustered index to fetch additional columns. Expensive for many rows.

Fix: add the needed columns to the non-clustered index with INCLUDE.

Transactions and Locking

Q81: What is a transaction?

A unit of work where all statements succeed or all fail together. ACID properties: Atomicity, Consistency, Isolation, Durability.

SQL

BEGIN TRANSACTION;

    UPDATE Accounts SET Balance = Balance - 100 WHERE Id = 1;
    UPDATE Accounts SET Balance = Balance + 100 WHERE Id = 2;

    IF @@ERROR <> 0
        ROLLBACK TRANSACTION;
    ELSE
        COMMIT TRANSACTION;

Q82: What is COMMIT and ROLLBACK?

COMMIT makes all changes in the transaction permanent. ROLLBACK undoes all changes since the transaction began.

Q83: What is a deadlock?

Two transactions each hold a lock the other needs. Neither can proceed. SQL Server detects deadlocks and kills one (the "deadlock victim"), which receives error 1205.

Transaction A: locks Row 1, waits for Row 2
Transaction B: locks Row 2, waits for Row 1
→ Deadlock

Prevention: access tables/rows in consistent order across transactions. Keep transactions short.

Q84: What are the SQL Server isolation levels?

| Level | Dirty Read | Non-Repeatable Read | Phantom Read | |---|---|---|---| | READ UNCOMMITTED | Yes | Yes | Yes | | READ COMMITTED (default) | No | Yes | Yes | | REPEATABLE READ | No | No | Yes | | SERIALIZABLE | No | No | No | | SNAPSHOT | No | No | No |

SQL

SET TRANSACTION ISOLATION LEVEL READ COMMITTED; -- SQL Server default
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;       -- MVCC-based, reduces blocking

Q85: What is NOLOCK and when is it dangerous?

NOLOCK = READ UNCOMMITTED — reads data that hasn't been committed. Can return "dirty" data (rows from rolled-back transactions), ghost rows, or miss rows.

Use only for approximate reporting queries where occasional stale data is acceptable. Never for financial data.

Q86: What is an optimistic vs pessimistic lock?

Pessimistic: locks data before reading to prevent others from modifying it. Blocking. Traditional SQL lock behaviour.

Optimistic: reads without locking, then checks on update if data was changed since the read. Uses ROWVERSION or TIMESTAMP columns. Less contention, but requires retry logic on conflict.

Query Optimisation

Q87: What causes a query to be slow?

Common causes:

Table scan on large table (missing index)
Key lookup (add INCLUDE columns to index)
Parameter sniffing (plan cached for bad parameter values)
Implicit type conversion (prevents index use)
Functions on indexed columns in WHERE
Large sort operation (no index providing order)
Excessive rows returned (missing selective filter)

Q88: Why do functions on columns in WHERE prevent index use?

SQL

-- SLOW — function applied to every row, index can't be used
SELECT * FROM Orders WHERE YEAR(CreatedAt) = 2025;

-- FAST — range on the column, index seek is possible
SELECT * FROM Orders WHERE CreatedAt >= '2025-01-01' AND CreatedAt < '2026-01-01';

Q89: What is parameter sniffing?

SQL Server compiles a query plan when first executed, using the parameter values at that moment. If those values lead to a plan that's bad for future calls with different values, performance degrades.

SQL

-- Fix option 1: OPTION (RECOMPILE) — recompile each execution
SELECT * FROM Orders WHERE CustomerId = @id OPTION (RECOMPILE);

-- Fix option 2: OPTIMIZE FOR — hint a "typical" value
SELECT * FROM Orders WHERE CustomerId = @id OPTION (OPTIMIZE FOR (@id = 1000));

Q90: What is the difference between WHERE and JOIN ON for filtering?

For INNER JOINs, they're equivalent — move filter wherever readable. For OUTER JOINs, they differ:

SQL

-- LEFT JOIN with filter in ON — includes all customers, only matches active orders
SELECT c.Name, o.Amount
FROM Customers c
LEFT JOIN Orders o ON c.Id = o.CustomerId AND o.Status = 'Active';

-- LEFT JOIN with filter in WHERE — only customers WITH an active order (converts to INNER JOIN)
SELECT c.Name, o.Amount
FROM Customers c
LEFT JOIN Orders o ON c.Id = o.CustomerId
WHERE o.Status = 'Active';  -- filters out NULLs, excluding customers with no orders

Normalisation

Q91: What is database normalisation?

Organising a database to reduce redundancy and improve data integrity by following normal forms.

Q92: What is First Normal Form (1NF)?

Each cell contains a single atomic value. No repeating groups.

SQL

-- VIOLATES 1NF — multiple values in one column
CustomerPhones: "555-1234, 555-5678"

-- 1NF — each value in its own row
CustomerPhone (CustomerId, Phone, PhoneType)

Q93: What is Second Normal Form (2NF)?

In 1NF, and every non-key column depends on the entire primary key (no partial dependencies). Only matters for composite keys.

Q94: What is Third Normal Form (3NF)?

In 2NF, and no non-key column depends on another non-key column (no transitive dependencies).

SQL

-- VIOLATES 3NF — PostCode determines City, but PostCode is not the key
Orders (OrderId, CustomerId, PostCode, City)

-- 3NF — extract the dependency
Postcodes (PostCode, City)
Orders (OrderId, CustomerId, PostCode) -- PostCode is FK to Postcodes

Q95: When would you intentionally denormalise?

When read performance is critical and the cost of joins outweighs the risk of redundancy. Common in reporting databases, OLAP, and read replicas. Always document intentional denormalisation.

Advanced Query Patterns

Q96: How do you do an upsert (insert or update) in SQL Server?

SQL

MERGE INTO Products AS target
USING (SELECT @Id AS Id, @Name AS Name, @Price AS Price) AS source
    ON target.Id = source.Id
WHEN MATCHED THEN
    UPDATE SET Name = source.Name, Price = source.Price
WHEN NOT MATCHED THEN
    INSERT (Id, Name, Price) VALUES (source.Id, source.Name, source.Price);

Q97: How do you delete duplicates keeping one row?

SQL

-- Keep the row with the lowest Id, delete others
WITH Dupes AS (
    SELECT Id,
        ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Id) AS Rn
    FROM Customers
)
DELETE FROM Dupes WHERE Rn > 1;

Q98: How do you pivot rows to columns?

SQL

-- Sales per product per quarter
SELECT ProductId, [Q1], [Q2], [Q3], [Q4]
FROM Sales
PIVOT (
    SUM(Amount)
    FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS PivotTable;

Q99: What is STRING_AGG?

Concatenates values from multiple rows into a single string.

SQL

-- List all order IDs per customer in one row
SELECT
    CustomerId,
    STRING_AGG(CAST(Id AS VARCHAR), ', ') WITHIN GROUP (ORDER BY Id) AS OrderIds
FROM Orders
GROUP BY CustomerId;

Q100: What is CROSS APPLY vs OUTER APPLY?

Like INNER JOIN and LEFT JOIN but for table-valued functions. CROSS APPLY returns rows only when the function returns results. OUTER APPLY returns the outer row even when the function returns nothing.

SQL

-- Top 3 orders per customer using CROSS APPLY
SELECT c.Name, o.Amount
FROM Customers c
CROSS APPLY (
    SELECT TOP 3 Amount FROM Orders
    WHERE CustomerId = c.Id
    ORDER BY Amount DESC
) o;

Additional Mid-Level Questions (Q101–Q130)

Q101: What is @@ROWCOUNT and OUTPUT? @@ROWCOUNT gives rows affected by the last statement. OUTPUT captures inserted/deleted/updated rows: DELETE FROM Orders OUTPUT deleted.* WHERE Status = 'Cancelled'.

Q102: What is a view and can it always be updated? A view is a saved query. Simple views (single table, no aggregation, no DISTINCT, no GROUP BY) can be updated. Complex views cannot without INSTEAD OF triggers.

Q103: What is an indexed (materialised) view? A view with a clustered index — data is physically stored and kept in sync with base tables automatically. Fast reads, overhead on writes. Requires strict conditions (no outer joins, no subqueries, etc.).

Q104: What is SCHEMABINDING on a view? Prevents underlying tables from being altered in ways that would break the view. Required for indexed views.

Q105: What is a trigger? Code that runs automatically on INSERT/UPDATE/DELETE. Uses inserted and deleted virtual tables. Use sparingly — hard to debug, can cause unexpected cascades.

Q106: What is SET NOCOUNT ON? Suppresses "N rows affected" messages. Reduces network traffic in stored procedures and prevents ORMs from misinterpreting the row count.

Q107: What is TRY...CATCH in SQL? Error handling block. Catches runtime errors (not compile errors). Use ERROR_MESSAGE(), ERROR_NUMBER() inside CATCH to log the error.

Q108: What is XACT_ABORT? SET XACT_ABORT ON — if any statement in a transaction fails, the entire transaction is automatically rolled back. Best practice for stored procedures with transactions.

Q109: What is the difference between DATETIME and DATETIME2? DATETIME2 has higher precision (100ns vs 3.33ms), wider range, and standard format. Prefer DATETIME2 for new tables.

Q110: What is DATETIMEOFFSET? Stores datetime with timezone offset. Useful for globally distributed applications where you need to preserve the original timezone.

Q111: What is JSON_VALUE and OPENJSON? SQL Server 2016+ functions for querying JSON stored in VARCHAR/NVARCHAR columns. JSON_VALUE(col, '$.name') extracts a scalar. OPENJSON(col) returns rows.

Q112: What is a sparse column? Optimised storage for columns that are mostly NULL. Reduces storage for NULLs but adds overhead when values are present. Useful for wide tables with many optional attributes.

Q113: What is page compression? SQL Server compresses pages to reduce I/O. Row compression stores values in minimum bytes. Page compression adds dictionary compression per page. Trades CPU for I/O — effective on read-heavy, compressible data.

Q114: What is a statistics object? Metadata SQL Server maintains about the distribution of values in indexed columns. Used by the query optimiser to estimate row counts. Stale statistics cause bad plans. Update with UPDATE STATISTICS.

Q115: What is OPTION (MAXDOP n)? Query hint limiting degree of parallelism — number of CPUs used for a single query. MAXDOP 1 forces serial execution, useful when parallelism overhead exceeds the benefit.

Q116: What is WITH (NOLOCK) table hint vs SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED? NOLOCK applies to one table. SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED applies to the entire session. Both allow dirty reads — identical in effect for that table/session.

Q117: What is a heap? A table with no clustered index. Rows are stored in insertion order. Fast for bulk inserts but slow for range scans. Most tables should have a clustered index.

Q118: What is IDENTITY vs SEQUENCE? IDENTITY is a table-column property for auto-incrementing. SEQUENCE is an independent object that can be shared across tables, pre-allocated in bulk, and reset. Use SEQUENCE when you need the value before INSERT.

Q119: What is GUID/UNIQUEIDENTIFIER as a primary key and its downsides? GUIDs are globally unique — good for distributed systems, merge replication. Downsides: 16 bytes vs 4 for INT, random values cause index fragmentation (use NEWSEQUENTIALID() to mitigate).

Q120: What is READ_COMMITTED_SNAPSHOT (RCSI)? A database-level setting that changes READ COMMITTED to use row versioning (MVCC) instead of shared locks. Readers don't block writers; writers don't block readers. Recommended for OLTP databases.

Q121: What is the difference between TRUNCATE and DELETE in terms of transactions? Both can be rolled back if inside an explicit transaction. TRUNCATE is minimally logged (faster). DELETE is fully logged (slower, but each row is logged individually). TRUNCATE resets IDENTITY; DELETE doesn't.

Q122: What are fill factors in indexes? The percentage of each index page to fill during creation. FILLFACTOR = 80 leaves 20% free, reducing page splits on inserts. Lower fill factor = more space used but fewer splits.

Q123: What is a missing index hint? SQL Server's execution plan shows "Missing Index" suggestions when it estimates an index would improve query cost by more than a threshold. Suggestions, not mandates — evaluate before blindly creating them.

Q124: How do you find the most expensive queries?

SQL

SELECT TOP 10
    total_logical_reads / execution_count AS avg_logical_reads,
    execution_count, total_elapsed_time / execution_count AS avg_duration_us,
    SUBSTRING(st.text, (qs.statement_start_offset/2)+1, 200) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY avg_logical_reads DESC;

Q125: What is sys.dm_exec_requests? A DMV showing currently executing requests. Use it to find blocking queries, long-running operations, and their wait types.

Q126: What is a wait type? What SQL Server is waiting for when a query is slow. PAGEIOLATCH_SH = I/O wait (add indexes). LCK_M_X = lock wait (blocking). CXPACKET = parallelism. Identify with sys.dm_exec_requests.

Q127: What is OPTIMIZE FOR UNKNOWN? Query hint telling SQL Server to compile the plan without sniffing the parameter value — uses average statistics instead. Prevents sniffing a plan optimised for a rare parameter value.

Q128: What is an implicit conversion and why is it bad? When SQL Server converts a value's type to match a column's type before comparing. Prevents index use. WHERE IntColumn = '5' — string '5' is converted, every row is read. Always use matching types.

Q129: What is INCLUDE vs adding columns to the key of an index? Columns in the key are part of the B-tree sort order — useful for filtering/ordering. INCLUDE columns are stored at the leaf level only — useful only to cover a SELECT, not for filtering. INCLUDE columns don't increase index key size, which keeps the tree shallow.

Q130: What is the difference between EXISTS and IN for performance? For correlated subqueries, EXISTS short-circuits at the first match — often faster. IN with a subquery materialises the full subquery result before comparing. For large result sets or correlated scenarios, prefer EXISTS. For small literal lists, IN is fine.