Data Engineering Journey - no. 03
Table of contents:
- What have you learned?
- What were 2-3 interesting points?
- What were 2-3 points you didn’t understand?
- Where does this skill or best practice fit?
- Conclusion
1. What have you learned?
Below is a list of SQL queries I used while querying the database:
How manyt tables is in database:
Exercise Simple SQL Queries
Get all columns from the tables Customers, Orders and Suppliers:
Get all Customers alphabetically, by Country and name:
Get all orders by date:
Get the count of all Orders made during 1997
Get all orders by date:
Get the names of all the contact persons where the person is a manager, alphabetically
Create a report for all the orders of 1996 and their Custome
Create a report that shows the number of employees and customers from each city that has employees in it
Create a report that shows the number of employees and customers from each city that has customers in it
Create a report that shows the number of employees and customers from each city
Create a report that shows the order ids and the associated employee names for orders that shipped after the required date
Create a report that shows the total quantity of products (from the Order_Details table) rdered. Only show records for products for which the quantity ordered is fewer than 200
Create a report that shows the total number of orders by Customer since December 31, 1996. The report should only return rows for which the total number of orders is greater than 15
Find the 10 top selling products
Create a view with total revenues per customer
Which UK Customers have payed us more than 1000 dollars
How much has each customer payed in total and how much in 1997
2. What were 2-3 interesting points?
There are two useful kinds of JOIN-s that we can use to solve queries of medium difficulty. These are called Self JOIN
and CROSS JOIN
.
CROSS JOINS
tend to generate very large tables, so they are usually costly. However, they virtually allow us to iterate over two tables directly with SQL.
In general, we should try to avoid them, but it’s good to know they exist in case we need them.
- Use an inner join to combine rows from two tables based on a shared column, and you only want to include rows where the data in the join column matches in both tables.
- Use a full outer join to combine all rows from two tables, regardless of whether there is a match in the join column.
- Use a left outer join to combine all rows from the left table and matching rows from the right table.
- Use a right outer join to combine all rows from the right table and matching rows from the left table.
3. What were 2-3 points you didn’t understand?
I understood everything; for me, this is a review of the material and the basics to reinforce and refresh the information I already had before.
4. Where does this skill or best practice fit?
This skill and best practice fit into the domain of database management and data engineering, particularly in the following areas:
- Data Querying and Analysis: These practices are essential for retrieving, analyzing, and reporting data efficiently. Mastering SQL is a core competency for roles such as data analysts, data engineers, and database administrators.
- ETL Processes (Extract, Transform, Load): SQL is a crucial tool for handling transformations in ETL pipelines. Writing efficient queries ensures smooth data extraction and integration into the target systems.
- Database Optimization: Understanding and using different types of joins, filters, and aggregation functions is key to optimizing database performance and ensuring scalability.
- Reporting and Business Intelligence: SQL queries, such as those retrieving total revenues or identifying trends, form the backbone of dashboards and business intelligence reports.
- Data Engineering Projects: These practices are foundational for constructing data pipelines, ensuring data quality, and enabling downstream analytical tasks.
- Best Practices for Collaboration: Writing clear, optimized, and reusable SQL queries is a best practice for teams working together on complex data projects. In summary, these skills align with roles in data-driven decision-making, enabling efficient data workflows, and supporting broader business objectives
Conclusion:
The queries presented above are both the starting point and the fundamental queries that can be used for querying and performing operations on the database.
My site is free of ads and trackers. Was this post helpful to you? Why not
Reference:
Disqus is great for comments/feedback but I had no idea it came with these gaudy ads.