In my Python workshops and online courses I see that one of the trickiest things for newcomers is the syntax itself. It’s very strict and many things might seem inconsistent at first. In this article I’ve collected the Python syntax essentials you should keep in mind as a data professional — and I added some formatting best practices as well, to help you keep your code nice and clean.
These are the basics. If you want to go deep down the rabbit hole, I’ll link to some advanced Python syntax and formatting tutorials at the end of this article!
This article is part of my Python for Data Science article series. If you haven’t done so yet, please start with these articles first:
- How to install Python, R, SQL and bash to practice data science!
- Python for Data Science #1 – Tutorial for Beginners – Python Basics
The 3 major things to keep in mind about Python syntax
#1 Line Breaks Matter
Unlike in SQL, line breaks matter in Python. That means that if you put a line break where it doesn’t belong, you will almost always get an error message. Is it weird? Hey, at least you don’t have to add semicolons at the end of every line.
So here’s Python syntax rule #1: one statement per line.
There are some exceptions, though. Expressions
- in parentheses (e.g. function and method calls),
- in square brackets (e.g. lists),
- and in curly braces (e.g. dictionaries)
can actually be split across several lines. This is called implicit line joining and it is a great help when working with bigger data structures.
my_movies = ['How I Met Your Mother',
             'Friends',
             'Silicon Valley',
             'Family Guy',
             'South Park']
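Implicit line joining works the same way inside parentheses and curly braces. Here is a minimal sketch of both; the names `total` and `movie_votes` and all the numbers are made up for illustration.

```python
# A function call split across lines inside its parentheses.
total = sum(
    [10, 20, 30],
    5,  # the optional start value of sum()
)

# A dictionary split across lines inside its curly braces.
# (The vote counts are invented sample values.)
movie_votes = {
    'Friends': 3,
    'Silicon Valley': 5,
    'South Park': 2,
}

print(total)
print(len(movie_votes))
```

Note that comments and trailing commas are allowed inside the brackets, which makes long data structures easier to edit later.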
Additionally, you can break any expression into more than one line if you put a backslash (\) at the end of the line. And you can do the opposite, too: put more than one statement on one line using semicolons (;) between the statements. However, these two methods are not too common, and I’d recommend using them only when necessary (e.g. with really long, 80+ character statements).
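To make the backslash and the semicolon concrete, here is a short sketch. The variable names are placeholders I picked for the example.

```python
# Explicit line joining: the backslash has to be the very last
# character on the line (no spaces or comments after it).
total = 1 + 2 + 3 + \
        4 + 5 + 6

# Two statements on one line, separated by a semicolon.
x = 10; y = 20

print(total)
print(x + y)
```

Both forms run fine, but as mentioned above, they are better kept for the rare cases where nothing else fits.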
#2 Indentations Matter
Do you hate indentations? Well, you are not alone. Many people who are just starting off with Python dislike the concept. For non-programmers it is unusual, and on top of that it causes the most errors in their scripts at the beginning. As for me, I love indentations and I promise that you will get used to them, too. Well, if you’ve worked your way through my Python articles so far, I’m pretty sure that you already have.
So we have Python syntax rule #2: make sure that you use indentations correctly and consistently.
Note: I talked about the exact syntax rules governing for loops and if statements in the relevant articles.
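As a quick reminder of what correct indentation looks like, and what happens when it is missing, here is a small sketch. The `bad_code` string and the use of `compile()` are just my way of triggering the error safely inside a running script; normally you would simply see the error when you run the cell.

```python
# Correct: each nested block is indented one level deeper (4 spaces).
even_numbers = []
for i in range(6):
    if i % 2 == 0:
        even_numbers.append(i)

# Incorrect: the body of the for loop is not indented.
# compile() only parses the code, so we can catch the error here.
bad_code = "for i in range(3):\nprint(i)\n"
try:
    compile(bad_code, "<example>", "exec")
    error_name = None
except IndentationError as err:
    error_name = type(err).__name__

print(even_numbers)
print(error_name)
```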
One more thing: if you watch the Silicon Valley TV show, you might have heard about the “tabs vs spaces” debate.
So tabs or spaces? Here’s what the original Style Guide for Python Code (PEP 8) says: spaces are the preferred indentation method.
Pretty straightforward!
ps. To be honest, in Jupyter Notebook, I use tabs.
#3 Case Sensitivity
Python is case sensitive. It makes a difference whether you type and (correct) or AND (won’t work). As a rule of thumb, most Python keywords have to be written in lowercase letters. The most commonly used exceptions I have to mention here (because I see many beginners have trouble with them) are the Boolean values. These are correctly spelled as True and False (not TRUE or true).
There’s Python syntax rule #3: Python is case sensitive.
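You can check case sensitivity yourself with a few lines. This is only a sketch: the standard library `keyword` module is real, while the variable names (`is_ready`, `score`, `Score`) are invented for the example.

```python
import keyword

# Keywords only count in their exact spelling.
lower_is_keyword = keyword.iskeyword("and")
upper_is_keyword = keyword.iskeyword("AND")

# Boolean values start with a capital letter.
is_ready = True

# A lowercase "true" is treated as an undefined variable name.
try:
    is_ready = true
except NameError as err:
    message = str(err)

# Variable names are case sensitive as well: these are two different variables.
score = 1
Score = 2

print(lower_is_keyword, upper_is_keyword)
print(message)
print(score, Score)
```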
Other Python Best Practices for Nicer Formatting
Let me just list a few (non-mandatory but highly recommended) Python best practices that will make your code much nicer, more readable and more reusable.
Python Best Practice #1: Use Comments
You can add comments to your Python code. Simply use the # character. Everything that comes after the # on that line won’t be executed.
# This is a comment before my for loop.
for i in range(0, 100, 2):
    print(i)
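Comments can also sit at the end of a line of code, and a # inside a string is not a comment at all. A small sketch, with variable names I made up for the example:

```python
# A full-line comment describes the block that follows.
total = 0
for i in range(0, 10, 2):
    total += i  # an inline comment explains a single line

# Inside quotes, the # is just a normal character of the string.
hashtag = "#python"

print(total)
print(hashtag)
```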
Python Best Practice #2: Variable Names
Conventionally, variable names should be written in lowercase letters, with the words in them separated by _ characters. Also, I generally do not recommend using one-letter variable names in your code. Using meaningful and easy-to-distinguish variable names helps other programmers a lot when they want to understand your code.
my_meaningful_variable = 100
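To see why naming matters, compare the same calculation written twice. The names and numbers below are invented sample values, not taken from any real dataset.

```python
# Hard to follow: what do a, b and c stand for?
a = 100
b = 3
c = a * b

# The same calculation with meaningful, lowercase, underscore-separated names.
unit_price = 100
ordered_quantity = 3
order_total = unit_price * ordered_quantity

print(c == order_total)
```

Both versions produce the same result, but only the second one tells the reader what is being calculated.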
Python Best Practice #3: Use blank lines
If you want to separate code blocks visually (e.g. when you have a 100 line Python script in which you have 10-12 blocks that belong together) you can use blank lines. Even multiple blank lines. It won’t affect the result of your script.
Python Best Practice #4: Use white spaces around operators and assignments
For cleaner code it’s worth using spaces around your = signs and your mathematical and comparison operators (+, -, *, ==, etc.). If you don’t use white spaces, your code will run anyway, but again: the cleaner the code, the easier it is to read and the easier it is to reuse.
number_x = 10
number_y = 100
number_mult = number_x * number_y
Python Best Practice #5: Max line length should be 79 characters
If you reach 79 characters in a line, it’s recommended to break your code into more lines. Use the above-mentioned \ character. With a \ at the end of the line, Python will ignore the line break and read your code as if it were one line.
(Or in some cases you can take advantage of implicit line joining.)
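Here is a sketch of both ways to keep a long statement under 79 characters. The quarterly revenue variables and their values are hypothetical, invented only to make the line long enough.

```python
first_quarter_revenue = 1200
second_quarter_revenue = 1500
third_quarter_revenue = 1100
fourth_quarter_revenue = 1900

# Option 1: break the line with a backslash.
yearly_revenue = first_quarter_revenue + second_quarter_revenue + \
    third_quarter_revenue + fourth_quarter_revenue

# Option 2: wrap the expression in parentheses (implicit line joining).
yearly_revenue_alt = (
    first_quarter_revenue
    + second_quarter_revenue
    + third_quarter_revenue
    + fourth_quarter_revenue
)

print(yearly_revenue == yearly_revenue_alt)
```

The parentheses version is usually easier to maintain, because a stray space after a backslash causes a syntax error.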
Python Best Practice #6: Stay consistent
And one of the most important rules: always stay consistent! Even if you follow the above rules, in specific situations you’ll have to create your own. Either way: make sure you are using these rules consistently. Ideally, you have to create Python scripts that you can open 6 months later without any trouble understanding them. If you randomly change your formatting rules and naming conventions, you’ll create an unnecessary headache for your future self. So stay consistent!
The Zen of Python – a nice easter egg
What else could come at the end of this article but a nice Python easter egg?
If you type import this into your Jupyter Notebook, you will get the 19 design “commandments” of Python:
>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Use this advice wisely! 😉
Well this is it. Follow this advice, and if you want to learn more about Python syntax essentials and best practices, I recommend these articles:
Now go ahead and check out the last article of the series: how to import Python libraries!
- If you want to learn more about how to become a data scientist, take my 50-minute video course: How to Become a Data Scientist. (It’s free!)
- Also check out my 6-week online course: The Junior Data Scientist’s First Month video course.