Following the previous article about creating data tables in SQL, now we want to load data into our freshly created SQL table. In this article, I’ll show you three different import methods:
- When you want to add the data line by line.
- When you want to import the data from a .csv file.
- When you add rows to your new SQL table that are the results of another SQL query.
Note: This is going to be a practical tutorial, so I encourage you to do the coding part with me.
Note 2: If you are new here, let’s start with these SQL articles first:
- How to install Python, SQL, R and Bash (for non-devs)
- How to install SQL Workbench for postgreSQL
- SQL for Data Analysis – Tutorial for Beginners – ep 1
- How to create a table in SQL
Data import method #1: When you want to add the new data line by line
When we have only a few lines of data, very often the easiest way is to add them manually. We can do this by using the
Let’s get back to our
test_results table that we created in the previous tutorial.
Currently it’s an empty table… Let’s change it and add a line to it using the
INSERT INTO test_results VALUES ('Walt', 1, '1980-12-01', 95.50, 'A', TRUE);
Query our table!
SELECT * FROM test_results;
Oh, yeah! Walt’s test results are in the table. While this is a very manual process, you can speed it up if you
INSERT the rest of the students with one bigger SQL statement:
INSERT INTO test_results VALUES ('Jesse', 2, '1988-02-11', 74.00, 'C', TRUE), ('Todd', 3, '1987-06-13', 60.00, 'D', TRUE), ('Tuco', 4, '1970-11-11', 15.50, 'F', FALSE), ('Gus', 5, '1975-08-08', 80.00, 'B', TRUE) ;
Query the table once again!
SELECT * FROM test_results;
Now, we have 5 students. Easy as pie!
Just spend a few more seconds reviewing the syntax:
INSERT INTOis the SQL keyword.
test_resultsis the name of the table that we want to put the data into.
VALUESis another SQL keyword.
- Then the actual data rows are coming one by one – each of them between parentheses and separated with commas.
- The field values are separated with commas.
- Watch out for the Text and Date data types because these have to go between apostrophes!
- And don’t forget the semicolon at the end of the whole statement!
Commit your changes!
As we have discussed in the previous article, if you do changes in your database with an SQL manager tool (like pgadmin4 or SQL Workbench), you have to commit them. What does it mean? Learn more here.
But for now, let’s just run these one extra line in your SQL manager:
Note: If you turned auto-commit on or if you are in the command line and not in an SQL query tool, then you can skip this commit step.
How to empty your table without deleting the table?
You have already learned about the
DROP TABLE SQL statement that deletes your SQL table. But very often you don’t want to delete your table (because you want to keep the structure of it), only clear that data. You can do this by using the
TRUNCATE TABLE statement.
TRUNCATE TABLE test_results;
This will delete all the rows that we have inserted in the table before; but it will keep the table itself.
Don’t forget that you have to commit your changes!
Okay, if everything set, let’s see the…
Data import method #2: When you want to import data from a .csv file
To be honest, this is a more common scenario than the first method I showed. As a data analyst you quite regularly get raw data sets in file formats, like Excel or .csv or .txt. You can import these data files using the
The general format looks like this:
COPY table_name FROM '/path/step/file_name' DELIMITER ' ';
COPYis the SQL keyword.
table_nameis the name of the table that you want to put the data into. (This is not intuitive if you just look at the syntax.)
FROMis another SQL keyword.
- Then you have to specify the filename and the location of the file that you want to copy the data from between apostrophes.
- And eventually you have to specify the field separator in your original file by typing
DELIMITERand the field separator itself between apostrophes. So in this case
' 'means that the delimiter would be a space.
Let’s see the whole process with an example:
STEP 1) First, you have to open your Terminal window and connect to your data server.
(Note: At this point assume you know how to do it – if not: this way.)
STEP 2) Then type this (just copy-paste from here) into the command line:
echo "Walt,1,1980-12-01,95.50,A,TRUE Jesse,2,1988-02-11,74.00,C,TRUE Todd,3,1987-06-13,60.00,D,TRUE Tuco,4,1970-11-11,15.50,F,FALSE Gus,5,1975-08-08,80.00,B,TRUE" > test_results.csv
This will create a .csv file called
(In real life cases you will get this .csv file from someone at your company.)
STEP 3) Double-check your new file: cat test_results.csv.
And find out the exact location of it by typing pwd.
STEP 4) Then you have to login to postgresql (still in your Terminal window):
psql -U [your_sql_username] -d postgres
(For me it’s
psql -U dataguy -d postgres)
STEP 5) Then type the
COPY statement we just discussed:
\COPY test_results FROM '/home/dataguy/test_results.csv' DELIMITER ',';
And boom, the data is copied from a .csv file into our SQL table. You can even query it from your SQL tool.
A few comments on the .csv import method
- I typed
\COPYand not just
COPYbecause my SQL user doesn’t have SUPERUSER privileges, so technically I could not use the
COPYcommand (this is an SQL thing). Typing
\COPYinstead is the simplest workaround — but the best solution would be to give yourself SUPERUSER privileges then use the original
COPYcommand. (In this video starting at 2:55 I show how to give SUPERUSER privileges to your SQL user.)
- Why we didn’t do the
COPYin our SQL manager tool? Same reason: if you don’t have SUPERUSER privileges, you can’t run the
COPYcommand from an SQL tool, only from the command line. If you follow the video that I linked in the previous point, you will be able to run the same COPY statement from pgadmin or SQL Workbench.
'/home/dataguy/test_results.csv'is the location of the file and the name of the file, together. Again, we found out the location by using the
pwdcommand in the right folder.
- And finally: if you are not comfortable with these command line steps, read the first few articles from my Bash for Data Analytics article series.
And boom, the data is copied from a csv file into our SQL table.
Run this query from your SQL query tool:
SELECT * FROM test_results;
Data import method #3: When you want to import the result of another SQL query
Do you want to store the output of your SQL query? Not a problem… Maybe you want to save your daily KPIs that are calculated from SQL tables; or you want to have the cleaned version of a data set next to the original. In SQL, you can do that easily.
Say we want to create a table where we want to store only the names from our
Step 1) Create the table :
CREATE TABLE student_names ( name TEXT );
INSERT INTO statement (that we learned in the “Data import method #1” section, at the beginning of this article), only instead of typing the values manually, put a
SELECT statement at the end of the query.
Something like this:
INSERT INTO student_names (SELECT name FROM test_results);
This SQL statement put the result of the subquery between parentheses into the freshly created
Check the result:
SELECT * FROM student_names;
You can even combine this method with SQL functions. For instance, you can calculate the average of the test results and then save into a new table:
CREATE TABLE test_averages ( test_average DECIMAL );
INSERT INTO test_averages (SELECT AVG(test_result) FROM test_results);
This table will store only one value: the average test result… but if we also had math test results, biology test results and physics test results in other SQL tables, this
test_averages table would be the perfect place to collect the different averages.
This was the third – slightly more advanced – way to import your data into an SQL table. Now go ahead and test these methods on your own data set!
In this article we learned three methods to import data into SQL tables:
- When you want to insert your data manually. (
INSERT INTO ___ VALUES (____);)
- When you want to import your data from a file. (
COPY ____ FROM '_____' DELIMITER ' ';)
- When you want to store the output of another SQL query. (
INSERT INTO ____ (SELECT ____);)
Now, you know how to create new tables in SQL and how to import data into them!
- If you want to learn more about how to become a data scientist, take my 50-minute video course: How to Become a Data Scientist. (It’s free!)
- Also check out my 6-week online course: The Junior Data Scientist’s First Month video course.
- Or simply subscribe to my Newsletter.