Without exception, all the online businesses I’ve worked with – or even just heard about – use SQL (Structured Query Language) in one way or another. Thus, learning it is very useful for anyone who wants to have a career in the online world. But for a Data Analyst (or Data Scientist) it’s a definite must, since SQL is one of the most common data languages, too. Let’s see why:
- SQL is easy to learn
- SQL performs extremely well on bigger data sets (even a few million lines)
- SQL has been around for more than 40 years and it will be here for at least the next 40
I put together an article series called SQL for Data Analysis – Tutorial for Beginners with 10 + 1 articles. By reading it, you can learn SQL in a very practical and fun way.
Need a free SQL Cheat Sheet first?
And here are the articles:
The first step is to set up your SQL environment. In this step-by-step tutorial article you can learn how to do that. We are going to use the best open-source SQL language: postgreSQL. But the tutorial itself will (optionally) walk you through how to install Python, R and bash too. Once you are done with everything, you will have your own SQL server to practice! READ>>
SQL Workbench is my favorite SQL manager tool. It’s fast, stable and runs on every operating system. SQL Workbench will boost your SQL learning curve because it makes your code easy to edit and visually transparent. The setup seems a bit difficult at first sight, so in this article I listed the 6 simple steps you should go through. By the end you will be all set! READ >>
The first article introduces the most basic SQL query, which you’ll use all the time. In this tutorial we will create our first SQL table, select data from it, and eventually filter for specific parameters! READ >>
4) SQL WHERE
The next chapter is dedicated to the SQL WHERE clause. It shows you the logical and comparison operators – and then you will see how to use the WHERE clause for sophisticated filtering. At the end of the article we will also import a 7.000.000+ row data set for further practicing. Plus, this is the first SQL for Data Analysis tutorial in which I included a video! 🙂 READ >>
Just like in Excel, you can use different functions in SQL. These include:
COUNT (for counting elements),
SUM (for summarizing integers),
AVG (for calculating mean),
MAX (for finding the greatest value),
MIN (for finding the smallest value), and so on… I guess I don’t have to emphasize how important these are.
This article also presents the sorting (
ORDER BY), the unique (
DISTINCT) and the grouping (
GROUP BY) “commands” of SQL. READ >>
At this point, you might already have reached an intermediate level in SQL for data analysis, so it’s worth it to take a break and learn a bit about SQL best practices. In this episode, I wrote down 6 things that will help you to make your future SQL efforts much smoother (and definitely much more syntax-error-free :-)). READ >>
7) SQL JOIN
SQL is super powerful for joining tables. Sometimes it does that even 1000-times faster than other languages. (I know this from my own experience.) So let’s just say, it’s worth spending some time to understand how you can use SQL JOIN like a pro. This article will describe the whole concept and give examples (with syntax) for the most common use cases. READ >>
In this episode, I listed three of the bit more advanced SQL methods. These are: sub-queries, SQL CASE (basically the if statement of SQL) and SQL HAVING (an advanced version of WHERE). While these are “advanced” (okay, maybe intermediate level — but definitely not beginner) SQL methods, sooner or later you will run into a problem where you have to use one or all of them! READ >>
As a data analyst, sometimes, you’ll have to create your own SQL tables. It has its own syntax. It’s not hard but you have to learn about it to be able using it efficiently. For that, you’ll also have to learn a bit about the different data types in SQL (like
BOOLEAN, and so on…) READ >>
Once you have created your SQL tables, you’ll have to put data into them. In this article, I’ll show you three different ways to do so using the
INSERT INTO and/or
COPY commands. READ >>
If you want to test your SQL skills with practical exercises – that are by the way quite typical in junior data scientist and analyst job interviews – check out this bonus episode: READ >>
A few more extra episodes in the Learn SQL for Data Analysis series:
- SQL current date (How to get the current date, time, month or year in postgreSQL?)
- SQL TRUNCATE table and DROP table (How to delete tables and data in SQL?)
Learning SQL for Data Analysis definitely pays off quickly: you will be much more efficient with your data-related tasks and you will get easier access to the business data you want to work with!
I hope these articles will help you to learn and understand everything! If you have any questions or feedback, let me know: just drop me an email.
- If you want to learn more about how to become a data scientist, take my 50-minute video course: How to Become a Data Scientist. (It’s free!)
- Also check out my 6-week online course: The Junior Data Scientist’s First Month video course.