10 Common SQL Interview Questions – Built In

SQL (Structured Query Language) is a programming language used to manage data in relational databases. It is used to retrieve, manipulate, and analyze the data stored there. SQL allows users to easily filter, aggregate, and join data from multiple tables, making it a powerful tool for data exploration and analysis. 
SQL is an essential tool for data analysts, data scientists, data engineers, and business intelligence analysts. Additionally, many business intelligence and data visualization tools use SQL as their primary means of querying data, so proficiency in SQL is necessary for working with them effectively. 
If you’re interviewing for any of these roles, you’ll need to demonstrate your knowledge of SQL by writing queries or answering questions related to SQL syntax and database design. You will also be asked how to optimize the SQL query for better performance and design concepts such as normalization and primary keys. Interviewers are also interested in knowing your experience with specific SQL-related tools or database engines. So, this article will cover some of the most commonly asked SQL questions in interviews. As a note, all the queries listed in the examples below are written in Microsoft SQL Server.
More From Dhilip SubramanianAn Introduction to Apache Kafka
 
Here are some skills to make sure you brush up on ahead of your interview.
 
 
This operator combines the result set of two or more SELECT statements and returns only distinct rows. The number and the order of columns in the SELECT statements must be the same.
 
This operator combines the result set of two or more SELECT statements and returns all rows, including duplicates. The number and the order of columns in the SELECT statements must be the same.
 
This operator returns only the rows that are present in both result sets of the SELECT statements. The number and the order of columns in the SELECT statements must be the same.
 
This operator returns only the rows that are present in the first SELECT statement but not in the second one. The number and the order of columns in the SELECT statements must be the same.
 
This is an alternative name for MINUS operator.
A cross join, also known as a Cartesian product, is a type of join in SQL that returns the combination of every row from the first table with every row from the second table. The result is a table that contains the number of rows equal to the product of the number of rows in each of the joined tables.
The syntax for a cross join in SQL is:
In this example, the CROSS JOIN keyword is used to combine every row from table one with every row from table two. The result is a table that contains the number of rows equal to the product of the number of rows in table one and table two.
 
Table Name: Weather
 
To fetch alternate records from the above table, we can use the ROW_NUMBER() function to assign a unique number to each row in the table and then use the modulo operator (MOD) to filter out every nth row.
 
This will return all rows with an even row number. To fetch every third row, you would change the query to WHERE row_num % 3 = 0, and so on.
 
The RANK() and DENSE_RANK() functions in SQL are used to assign a unique rank to each row within a result set based on the values in one or more columns.
The RANK() function assigns a unique rank to each row within a result set based on the values in one or more columns. If two or more rows have the same values in the specified column(s), they will receive the same rank, and the next rank will be skipped. 
For example, if two rows have a rank of two, the next row will have a rank of four.
The DENSE_RANK() function also assigns a unique rank to each row within a result set based on the values in one or more columns. But in this case, if two or more rows have the same values in the specified column(s), they will receive the same rank, and the next rank will not be skipped. 
For example, if two rows have a rank of two, the next row will have a rank of three.
 
 
Both of these functions can be used along with the OVER() clause to specify the column(s) to base the ranking on and the sort order of the ranking.
 
A stored procedure is a pre-defined set of SQL commands that are saved and can be reused multiple times by a database management system. In simple terms, you can think of a stored procedure as a set of instructions that you can run on a database to accomplish a specific task.
The main reason why we use stored procedures is that they allow us to perform complex or repetitive tasks in a more efficient and organized way. Instead of writing and executing the same SQL commands over and over again, you can create a stored procedure once and then call it whenever you need to perform that specific task.
For example, let’s say you have a database that stores information about customer orders for an e-commerce website. You may need to frequently run a query that retrieves all the customer orders that were placed in the last 30 days. Instead of writing and running the same query every time you need that information, you can create a stored procedure that performs that query and then call it whenever you need the information.
 
This stored procedure is named getRecentOrders and it uses a SELECT statement to retrieve all columns from the orders table where the order_date is greater than or equal to 30 days ago (using the DATEADD function). The SET NOCOUNT ON statement is used to prevent the procedure from returning the number of rows affected by the SELECT statement.
We can execute this stored procedure by calling it using the EXEC command:
The syntax and structure of stored procedures may vary depending on the specific database management system.
Another benefit of stored procedures is that they can help to improve the security of your database. By using stored procedures, you can limit the access that users have to the underlying data tables, which can help to prevent unauthorized access and accidental data modification.
Stored procedures can also improve the performance of your database by reducing the amount of data that needs to be transferred between the database and the application. Stored procedures can also be used to minimize the number of database trips.
More in Data ScienceUnderstanding Ordinary Least Squares (OLS) Regression
 
Table: Orders
We can find duplicate records is by using GROUP BY and HAVING clause
 
This query uses the GROUP BY clause to group the results by the order_id, customer_id, and order_date columns, and then uses the COUNT() function to return the number of rows in each group. The HAVING clause is used to filter the results to only include groups where the count is greater than one, which will return all the records that have duplicate values in the selected columns.
 
Without an index in an SQL table, searching for specific data within the table would be much slower. This is because the database management system (DBMS) would have to scan every row in the table to find the relevant data, rather than being able to quickly locate it using an index. This process is known as a full-table scan.
When a table is small, the difference in performance may not be significant. As the table grows in size, however, the difference in performance can become substantial. This can lead to slow query performance and may even cause the system to become unresponsive if the table is large enough.
Indexes speed up data retrieval by allowing the DBMS to quickly locate the data it needs. Each index is a separate data structure that stores a copy of the data from one or more columns in the table, along with a pointer to the location of the corresponding data in the table. This allows the DBMS to look up the data it needs in the index rather than having to scan the entire table.
Table: Customer (Reference: Leetcode)
Table: Product
Write an SQL query to report the customer IDs from the Customer table that bought all the products in the Product table.
 
 
We can solve the above problem by using a combination of the INNER JOIN, GROUP BYHAVING and COUNT aggregate functions.
The above query first joins the Customer table with the Product table on the product_key column. Then it groups the results by customer_id and counts the number of distinct product_keys that each customer bought.
Then it selects the customer_id from the joined table where the product_count is equal to the total number of products in the Product table. The COUNT(DISTINCT Product.product_key) ensures that a customer is only counted once for each unique product that they bought.
The query then uses a HAVING clause to filter out only those customers whose product_count matches the total number of products in the Product table.
 
In SQL, a table is a collection of data that is stored in a specific format, with rows and columns. Tables are used to store data that can be queried, updated, and deleted. For example, an Employees table might have columns for ID, name, email, and salary, and each row in the table would represent a different employee.
A view, on the other hand, is a virtual table that is based on the result of an SQL query. A view does not store data itself, but instead references data from one or more tables and presents it in a specific format. For example, a Top_Salaries view might be created to only show the employees with the highest salaries from the Employees table. The view would not store any data itself, but would instead reference the data from the Employees table and present it in the desired format.
Another important difference between views and tables is that views can be read-only, meaning that you cannot insert, update, or delete data from a view. Tables, on the other hand, can be updated and deleted.
In simple terms, a table is where the data is stored, and a view is a virtual table that references the data from one or more tables, allowing you to see the data in a specific format without changing the original data.
More on SQLWhy SQLZoo Is the Best Way to Practice SQL
 
Table: Employee
This query first uses a Common Table Expression (CTE) to select the top three distinct salaries from the Employee table, ordered by salary in descending order. The CTE is then used to select the minimum salary from that list of top three salaries, which is the third highest salary in the table. Here, the third highest salary is $25,000.
Built In’s expert contributor network publishes thoughtful, solutions-oriented stories written by innovative tech professionals. It is the tech industry’s definitive destination for sharing compelling, first-person accounts of problem-solving on the road to innovation.

source

Leave a Comment