We’ve all been there. You inherit a single SQL query that spans 500 lines, looks like a tangled bowl of noodles, and takes five minutes to run. Your task? Just add one more column. It’s a nightmare.

Getting the right answer from the database is only half the battle. A truly great data professional writes queries that are not just correct, but also clean, fast, and understandable. This isn’t just about being tidy; it’s about saving yourself (and your teammates) hours of future frustration.

Let’s dive into the core habits that will transform your SQL from a messy script into a masterpiece of logic and efficiency.


Cleanliness is Next to Godliness: Formatting Your SQL

The fastest way to improve your queries is to make them readable. A well-formatted query reveals its logic at a glance.

Why it matters: When you revisit your code six months from now, you’ll thank yourself. When a colleague needs to debug your work, they’ll thank you too.

Here are some simple rules to live by:

  • Capitalize Keywords: Use UPPERCASE for SQL keywords like SELECT, FROM, JOIN, and WHERE. Use lowercase for your table and column names.
  • Use Strategic Indentation: Use new lines and indentation to separate logical blocks. Each JOIN gets its own line. The WHERE, GROUP BY, and ORDER BY clauses are all indented at the same level.
  • One Column Per Line: In your SELECT statement, list each column on its own line. This makes it easy to see what fields are being pulled and simplifies adding or removing columns later.

Before:

SQL

select u.id,u.name,p.order_date from users as u join purchases as p on u.id=p.user_id where u.signup_date > '2025-01-01' and p.amount > 1000;

After:

SQL

SELECT
    u.id,
    u.name,
    p.order_date
FROM
    users AS u
JOIN
    purchases AS p
    ON u.id = p.user_id
WHERE
    u.signup_date > '2025-01-01'
    AND p.amount > 1000; -- Assuming amount is in INR

The difference is night and day. The second query tells a clear story.


Taming Complexity with CTEs and Aliases

As your logic grows, it’s easy to fall into “subquery hell”—a confusing maze of nested SELECT statements. The hero that saves us from this is the Common Table Expression (CTE).

Think of CTEs as named, temporary tables that exist only for a single query. You define them at the beginning using a WITH clause. This allows you to break a complex problem into simple, logical steps.

Before: Subquery Hell

SQL

SELECT
    avg(orders_per_user)
FROM (
    SELECT
        user_id,
        count(order_id) AS orders_per_user
    FROM
        orders
    WHERE
        order_date BETWEEN '2025-07-01' AND '2025-07-31'
    GROUP BY
        user_id
) AS monthly_orders;

After: Clean and Logical with a CTE

SQL

WITH monthly_orders AS (
    SELECT
        user_id,
        count(order_id) AS orders_per_user
    FROM
        orders
    WHERE
        order_date BETWEEN '2025-07-01' AND '2025-07-31'
    GROUP BY
        user_id
)

SELECT
    avg(orders_per_user)
FROM
    monthly_orders;

The CTE version reads like a story: “First, create a temporary table of monthly orders, then find the average from that table.”

A quick note on aliasing: Always give your tables short, logical aliases (e.g., users AS u). This keeps your query tidy and easy to read, especially when joining multiple tables.


From Slow to Go: Making Your Queries Fast

A clean query is great, but a fast query is even better. Performance tuning can seem like a dark art, but two concepts will solve 80% of your performance problems: indexes and explain plans.

The Magic of Indexes

An index is a special lookup table that the database search engine can use to speed up data retrieval.

Analogy: Imagine trying to find a topic in a book without an index page. You’d have to scan every single page. An index lets you jump directly to the right page. That’s exactly what a database index does for your data.

Rule of Thumb: If you frequently filter (WHERE clause) or join (ON clause) on a specific column, it’s a prime candidate for an index. For example, user_id, email, or transaction_date are columns that almost always benefit from being indexed.

Understand the EXPLAIN Plan

How do you know if your query is using an index? You ask the database!

The EXPLAIN command (or EXPLAIN ANALYZE in PostgreSQL) is your window into the database’s mind. It shows you the execution plan—the step-by-step process the database will use to run your query.

When you run EXPLAIN on your query, look for one major red flag: a Full Table Scan. This means the database is reading every single row in a table, just like reading a book page-by-page. If you see this on a large table (millions of rows), it’s a sign that you are likely missing an index on the column in your WHERE clause.


Conclusion: It’s a Craft

Writing great SQL is a craft that develops with practice. By focusing on clean formatting, simplifying logic with CTEs, and boosting performance with indexes and EXPLAIN plans, you’re doing more than just getting the right answer.

You’re building systems that are robust, maintainable, and efficient. You’re becoming the data professional that everyone wants on their team.

Leave a Reply

Your email address will not be published. Required fields are marked *