SQL Interview Questions
SQL is one of the most common topics in data, backend, and analyst interviews. These are the questions interviewers actually ask, with concise answers you can speak confidently.
17 questions with concise, interview-ready answers.
1. What is SQL?
SQL (Structured Query Language) is the standard language for storing, retrieving, and manipulating data in relational databases. It lets you define schemas, insert and update rows, query data with conditions and joins, and control access. It is declarative — you describe what data you want, and the database engine decides how to fetch it.
2. What are DDL, DML, DCL, and TCL commands?
DDL (Data Definition Language) defines structure: CREATE, ALTER, DROP, TRUNCATE. DML (Data Manipulation Language) changes data: INSERT, UPDATE, DELETE. SELECT is often grouped with DML, though some references classify it separately as DQL (Data Query Language). DCL (Data Control Language) manages permissions: GRANT and REVOKE. TCL (Transaction Control Language) manages transactions: COMMIT, ROLLBACK, and SAVEPOINT.
3. What are the different types of JOINs in SQL?
INNER JOIN returns only rows that match in both tables. LEFT (OUTER) JOIN returns all rows from the left table plus matches from the right, with NULLs where there is no match; RIGHT JOIN is the mirror image. FULL OUTER JOIN returns all rows from both tables, matched where possible. CROSS JOIN returns the Cartesian product of both tables.
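To make the join types concrete, here is a minimal sketch using Python's built-in sqlite3 module. The employees and departments tables and their rows are invented for illustration. RIGHT and FULL OUTER JOIN are left out because older SQLite builds do not support them.

```python
import sqlite3

# Hypothetical tables, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO employees VALUES (1, 'Ada', 1), (2, 'Grace', NULL);
""")

# INNER JOIN: only employees with a matching department.
inner_rows = conn.execute("""
    SELECT e.name, d.name
    FROM employees e
    INNER JOIN departments d ON d.id = e.dept_id
    ORDER BY e.id
""").fetchall()

# LEFT JOIN: every employee; the department is NULL (None) when unmatched.
left_rows = conn.execute("""
    SELECT e.name, d.name
    FROM employees e
    LEFT JOIN departments d ON d.id = e.dept_id
    ORDER BY e.id
""").fetchall()

# CROSS JOIN: every employee paired with every department.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM employees CROSS JOIN departments"
).fetchone()[0]

print(inner_rows, left_rows, cross_count)
```

Grace has no department, so she appears only in the LEFT JOIN result.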
4. What is the difference between WHERE and HAVING?
WHERE filters individual rows before grouping and cannot use aggregate functions. HAVING filters groups after GROUP BY has been applied and is where you put conditions on aggregates, such as HAVING COUNT(*) > 5. You can use both in the same query — WHERE narrows the rows first, then HAVING filters the resulting groups.
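A small sketch of both filters in one query, run through Python's sqlite3 module. The orders table, its columns, and its rows are assumptions made up for this example.

```python
import sqlite3

# Hypothetical orders table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL, status TEXT);
    INSERT INTO orders VALUES
        ('alice', 50, 'paid'), ('alice', 70, 'paid'), ('alice', 30, 'cancelled'),
        ('bob',   20, 'paid'),
        ('carol', 90, 'paid'), ('carol', 10, 'paid');
""")

rows = conn.execute("""
    SELECT customer, COUNT(*) AS paid_orders, SUM(amount) AS total
    FROM orders
    WHERE status = 'paid'      -- row filter: runs before grouping
    GROUP BY customer
    HAVING COUNT(*) >= 2       -- group filter: runs after aggregation
    ORDER BY customer
""").fetchall()

print(rows)
```

The cancelled order is dropped by WHERE before counting, and bob's single paid order is dropped by HAVING afterwards.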
5. What does GROUP BY do?
GROUP BY collapses rows that share the same values in the listed columns into a single summary row, so you can apply aggregate functions per group. For example, GROUP BY department lets you compute COUNT(*) or AVG(salary) for each department. In standard SQL, every non-aggregated column in the SELECT list must appear in the GROUP BY clause or be functionally dependent on it; some engines, such as SQLite and MySQL with ONLY_FULL_GROUP_BY disabled, relax this rule.
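The per-department example can be sketched as follows with Python's sqlite3 module. The employees table and its salary figures are hypothetical.

```python
import sqlite3

# Hypothetical employees table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ada', 'Engineering', 120),
        ('Linus', 'Engineering', 100),
        ('Grace', 'Sales', 80);
""")

# One summary row per department; department is the only
# non-aggregated column, so it is the only GROUP BY column.
summary = conn.execute("""
    SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
    ORDER BY department
""").fetchall()

print(summary)
```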
6. What is the difference between a primary key and a foreign key?
A primary key uniquely identifies each row in a table; it cannot be NULL and there is only one per table, although it may span several columns. A foreign key is a column (or set of columns) that references a primary key or unique key, usually in another table but possibly in the same one, enforcing referential integrity. A foreign key can be NULL and can have duplicate values, unlike a primary key.
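A minimal sketch of both constraints in action, using Python's sqlite3 module. The table names and the try_insert helper are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: off by default
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        dept_id INTEGER REFERENCES departments(id)  -- foreign key
    );
    INSERT INTO departments VALUES (1, 'Engineering');
    -- Duplicate and NULL foreign key values are both allowed.
    INSERT INTO employees VALUES (1, 'Ada', 1), (2, 'Linus', 1), (3, 'Grace', NULL);
""")

def try_insert(sql):
    """Hypothetical helper: run one INSERT and report whether it was accepted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

duplicate_pk = try_insert("INSERT INTO employees VALUES (1, 'Dup', 1)")
missing_parent = try_insert("INSERT INTO employees VALUES (4, 'Orphan', 99)")

print(duplicate_pk)
print(missing_parent)
```

Both inserts are rejected: the first reuses an existing primary key value, the second points at a department that does not exist.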
7. What is the difference between UNION and UNION ALL?
UNION combines the result sets of two queries and removes duplicate rows, which requires a sort or hash step and is therefore slower. UNION ALL combines them and keeps every row including duplicates, making it faster. Use UNION ALL when you know there are no duplicates or you want them preserved.
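The difference shows up as soon as the two inputs overlap. This sketch uses Python's sqlite3 module; the two customer tables are invented for illustration.

```python
import sqlite3

# Hypothetical tables that share one email address.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers_2023 (email TEXT);
    CREATE TABLE customers_2024 (email TEXT);
    INSERT INTO customers_2023 VALUES ('a@example.com'), ('b@example.com');
    INSERT INTO customers_2024 VALUES ('b@example.com'), ('c@example.com');
""")

# UNION removes the duplicate 'b@example.com'.
union_emails = [row[0] for row in conn.execute("""
    SELECT email FROM customers_2023
    UNION
    SELECT email FROM customers_2024
    ORDER BY email
""")]

# UNION ALL keeps every row from both inputs.
union_all_emails = [row[0] for row in conn.execute("""
    SELECT email FROM customers_2023
    UNION ALL
    SELECT email FROM customers_2024
    ORDER BY email
""")]

print(len(union_emails), len(union_all_emails))
```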
8. What is the difference between DELETE, TRUNCATE, and DROP?
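DELETE is a DML command that removes the rows matching an optional WHERE clause, logs each removed row, and can be rolled back. TRUNCATE is a DDL command that quickly removes all rows, typically by deallocating data pages; it cannot use a WHERE clause and, in many engines, resets identity counters. Whether TRUNCATE can be rolled back depends on the engine: it is transactional in SQL Server and PostgreSQL, but it commits implicitly in Oracle and MySQL. DROP removes the entire table, both its data and its structure, from the database.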
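The sketch below shows a DELETE being rolled back and a DROP removing the table definition, using Python's sqlite3 module. SQLite has no TRUNCATE statement, so that command is not demonstrated. The logs table and the count_logs helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT);
    INSERT INTO logs (level) VALUES ('info'), ('debug'), ('error');
""")

def count_logs():
    """Hypothetical helper: number of rows currently visible in logs."""
    return conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]

# DELETE with a WHERE clause; sqlite3 opens a transaction implicitly.
conn.execute("DELETE FROM logs WHERE level = 'debug'")
after_delete = count_logs()

# Rolling back restores the deleted row.
conn.rollback()
after_rollback = count_logs()

# DROP removes the table itself, not just its rows.
conn.execute("DROP TABLE logs")
table_exists = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'logs'"
).fetchone()[0] == 1

print(after_delete, after_rollback, table_exists)
```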
9. What are aggregate functions in SQL?
Aggregate functions compute a single value from a set of rows: COUNT counts rows, SUM totals a numeric column, AVG averages it, and MIN and MAX return the smallest and largest values. They are commonly used with GROUP BY to produce per-group summaries. Most aggregate functions ignore NULL values, except COUNT(*) which counts all rows.
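NULL handling is the part interviewers tend to probe, so here is a short sketch with Python's sqlite3 module. The bonuses table is made up for this example and contains one NULL bonus.

```python
import sqlite3

# Hypothetical table with one NULL value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bonuses (employee TEXT, bonus REAL);
    INSERT INTO bonuses VALUES ('Ada', 100), ('Grace', NULL), ('Linus', 300);
""")

stats = conn.execute("""
    SELECT COUNT(*),      -- counts every row, including the NULL bonus
           COUNT(bonus),  -- counts only non-NULL bonuses
           SUM(bonus),
           AVG(bonus),    -- averages over the two non-NULL values
           MIN(bonus),
           MAX(bonus)
    FROM bonuses
""").fetchone()

print(stats)
```

AVG divides by the two non-NULL values, not by the three rows, which is why it differs from SUM(bonus) / COUNT(*).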
10. What is the difference between a subquery and a join?
A join combines columns from multiple tables into one result set based on a matching condition, and is generally efficient and readable for relating data. A subquery is a query nested inside another query, often used to compute a value or a list for filtering, such as in a WHERE IN clause. Modern optimizers often produce the same plan for equivalent forms, so the choice is largely about clarity: joins suit combining columns from several tables, while subqueries can be clearer when you need an intermediate or aggregated result.
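As an illustration, the same question ("which customers have placed at least one order?") can be written either way. This sketch uses Python's sqlite3 module; the customers and orders tables are hypothetical.

```python
import sqlite3

# Hypothetical tables, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO orders VALUES (1, 1, 50), (2, 1, 70), (3, 3, 20);
""")

# Join form: DISTINCT is needed because alice matches two orders.
join_rows = conn.execute("""
    SELECT DISTINCT c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name
""").fetchall()

# Subquery form: the inner query produces the list used for filtering.
subquery_rows = conn.execute("""
    SELECT name
    FROM customers
    WHERE id IN (SELECT customer_id FROM orders)
    ORDER BY name
""").fetchall()

print(join_rows == subquery_rows, join_rows)
```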
11. What is a correlated subquery?
A correlated subquery is an inner query that references a column from the outer query, so it is conceptually re-evaluated once for each row the outer query processes. For example, finding employees who earn more than their own department average uses the outer row to filter the inner aggregate. If the optimizer cannot rewrite it into a join, it really does run per row, so it can be slower than an equivalent join and should be used carefully on large tables.
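The department-average example can be sketched like this with Python's sqlite3 module, alongside one possible join rewrite. The employees table and its figures are assumptions made for illustration.

```python
import sqlite3

# Hypothetical employees table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ada', 'Engineering', 120), ('Linus', 'Engineering', 100),
        ('Grace', 'Sales', 90), ('Ken', 'Sales', 70);
""")

# Correlated form: the inner query refers to e.department from the outer row.
correlated = conn.execute("""
    SELECT e.name
    FROM employees e
    WHERE e.salary > (
        SELECT AVG(salary) FROM employees WHERE department = e.department
    )
    ORDER BY e.name
""").fetchall()

# Join form: compute each department average once, then join to it.
joined = conn.execute("""
    SELECT e.name
    FROM employees e
    JOIN (
        SELECT department, AVG(salary) AS avg_salary
        FROM employees
        GROUP BY department
    ) d ON d.department = e.department
    WHERE e.salary > d.avg_salary
    ORDER BY e.name
""").fetchall()

print(correlated, joined)
```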
12. What is an index and why is it used?
An index is a data structure (typically a B-tree) that lets the database find rows quickly without scanning the whole table, much like an index in a book. It can dramatically speed up SELECT queries and WHERE/JOIN lookups on the indexed columns, provided the condition is selective enough for the query planner to use it. The trade-off is slower INSERT, UPDATE, and DELETE operations and extra storage, because the index must be kept in sync.
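One way to see an index take effect is to compare the query plan before and after creating it. This sketch uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the users table, the idx_users_email index, and the query_plan helper are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)

def query_plan(sql, params=()):
    """Hypothetical helper: return SQLite's plan steps as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable description.
    return " | ".join(row[-1] for row in rows)

lookup = "SELECT id FROM users WHERE email = ?"

plan_before = query_plan(lookup, ("user42@example.com",))  # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup, ("user42@example.com",))   # index search

print(plan_before)
print(plan_after)
```

Other engines expose the same information through their own EXPLAIN variants.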
13. What is a view in SQL?
A view is a virtual table defined by a stored SELECT query; it does not hold data itself but presents the result of that query on demand. Views simplify complex queries, provide a consistent interface, and can restrict which columns or rows users see for security. A materialized view, by contrast, physically stores the result and must be refreshed to stay current.
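A short sketch of a view that hides a sensitive column, using Python's sqlite3 module. The employees table and the employee_directory view are invented for illustration. SQLite has no materialized views, so only the ordinary kind is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER
    );
    INSERT INTO employees VALUES
        (1, 'Ada', 'Engineering', 120), (2, 'Grace', 'Sales', 80);

    -- The view exposes everything except salary.
    CREATE VIEW employee_directory AS
        SELECT id, name, department FROM employees;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY id")
view_columns = [col[0] for col in cursor.description]
rows_before = cursor.fetchall()

# The view stores no data: new base-table rows appear on the next query.
conn.execute("INSERT INTO employees VALUES (3, 'Linus', 'Engineering', 100)")
rows_after = conn.execute(
    "SELECT * FROM employee_directory ORDER BY id"
).fetchall()

print(view_columns, len(rows_before), len(rows_after))
```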
14. What is normalization?
Normalization is the process of organizing tables to reduce data redundancy and avoid update anomalies by splitting data into related tables. It progresses through normal forms — 1NF removes repeating groups, 2NF removes partial dependencies on a composite key, and 3NF removes transitive dependencies. The goal is data integrity, though heavy normalization can require more joins, so analytics systems sometimes denormalize for read performance.
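The update anomaly that normalization avoids can be sketched as follows with Python's sqlite3 module. Both schemas (orders_flat on one side, customers plus orders on the other) and all of the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized: the customer's email is repeated on every order row.
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY,
        customer_name TEXT, customer_email TEXT, item TEXT
    );
    INSERT INTO orders_flat VALUES
        (1, 'Alice', 'alice@old.example', 'keyboard'),
        (2, 'Alice', 'alice@old.example', 'mouse');

    -- Normalized: customer facts live in one row; orders reference it by key.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        item TEXT
    );
    INSERT INTO customers VALUES (1, 'Alice', 'alice@old.example');
    INSERT INTO orders VALUES (1, 1, 'keyboard'), (2, 1, 'mouse');
""")

# Update anomaly: changing one flat row leaves two conflicting emails.
conn.execute(
    "UPDATE orders_flat SET customer_email = 'alice@new.example' WHERE order_id = 1"
)
flat_distinct_emails = conn.execute(
    "SELECT COUNT(DISTINCT customer_email) FROM orders_flat"
).fetchone()[0]

# Normalized: one UPDATE changes the email for every order.
conn.execute("UPDATE customers SET email = 'alice@new.example' WHERE id = 1")
normalized_emails = conn.execute("""
    SELECT DISTINCT c.email
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
""").fetchall()

print(flat_distinct_emails, normalized_emails)
```

The price of the normalized design is the extra join needed to read an order together with its customer's email.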
15. How do you find the second highest salary in a table?
A common approach is a subquery: SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees), which finds the highest salary below the maximum. A more flexible approach uses a window function such as DENSE_RANK() OVER (ORDER BY salary DESC) and filters for rank 2, which also generalizes to the Nth highest. The window-function method handles ties and arbitrary N more cleanly.
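Both approaches can be sketched with Python's sqlite3 module. The employees table and the nth_highest_salary helper are hypothetical, and the window-function version assumes SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical table; note the tie at the top salary.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ada', 200), ('Linus', 200), ('Grace', 150), ('Ken', 100);
""")

# Subquery approach: the largest salary strictly below the maximum.
second_via_subquery = conn.execute("""
    SELECT MAX(salary)
    FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees)
""").fetchone()[0]

def nth_highest_salary(n):
    """Hypothetical helper: Nth highest distinct salary, or None if absent."""
    row = conn.execute("""
        SELECT DISTINCT salary
        FROM (
            SELECT salary,
                   DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
            FROM employees
        )
        WHERE rnk = ?
    """, (n,)).fetchone()
    return row[0] if row else None

print(second_via_subquery, nth_highest_salary(2), nth_highest_salary(3))
```

Because DENSE_RANK gives both 200 salaries rank 1, rank 2 is 150 rather than the second 200.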
16. What is the difference between RANK, DENSE_RANK, and ROW_NUMBER?
ROW_NUMBER assigns a unique sequential number to every row, even when the ordering values tie. RANK gives tied rows the same rank but then skips the next numbers, so ranks can jump (1, 2, 2, 4). DENSE_RANK also gives ties the same rank but does not skip, leaving ranks consecutive (1, 2, 2, 3). All three are window functions that require an OVER clause; the ORDER BY inside it determines the ranking, and some engines, such as SQL Server, make it mandatory.
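The tie pattern described above can be reproduced with a small sketch in Python's sqlite3 module, assuming SQLite 3.25 or newer for window-function support. The scores table is invented for illustration; name is added to the ROW_NUMBER ordering only to make the tie-break deterministic.

```python
import sqlite3

# Hypothetical table with a tie between 'b' and 'c'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (name TEXT, score INTEGER);
    INSERT INTO scores VALUES ('a', 90), ('b', 80), ('c', 80), ('d', 70);
""")

rows = conn.execute("""
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
           RANK()       OVER (ORDER BY score DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
""").fetchall()

row_numbers = [r[1] for r in rows]
ranks = [r[2] for r in rows]
dense_ranks = [r[3] for r in rows]

print(row_numbers, ranks, dense_ranks)
```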
17. What is the difference between a stored procedure and a function?
A stored procedure is a named set of SQL statements stored in the database (many engines cache its execution plan after the first run) that can perform actions, return zero or many result sets, use output parameters, and contain transactions; you call it with EXEC or CALL. A function must return a single value (or a table) and is designed to be used inside SQL expressions such as a SELECT or WHERE clause. In many engines, SQL Server for example, functions cannot modify database state or manage transactions, whereas procedures can; the exact rules vary by DBMS.