A Beginner’s Guide to Python Quirks and Jargon

October 05, 2019

The data science boom has welcomed a lot of new Python developers from many different walks of life: from seasoned hackers, statisticians, and mathematicians, to business executives, journalists, and basically anyone who wants to be part of the inevitable data revolution. We are in exciting times, people. We are on the cusp of the fourth industrial revolution, and everyone wants in.

For the beginner programmer, it’s mighty convenient that Python is one of the two famous programming languages for data science. With its succinct syntax, easy readability, and closeness to natural language, it’s no wonder that 70% of introductory programming courses in US universities teach it. I can think of no gentler introduction to programming.

With all the online courses you can take, and all the blog posts and tutorials sprawled across the interwebs, there’s no shortage of resources for learning the language. Furthermore, thanks to sites like Stack Overflow, there will always be a community of other programmers to help you should you have any questions (most of us are nice, well, most of the time).

Gaining knowledge has never been easier, and I’d like to invite all of you to join me and appreciate that for a minute.

But while these courses and tutorials can quickly get you up to speed with the basics of the language and the relevant data science libraries (pandas, numpy, matplotlib, and sklearn, to name a few), most barely scratch the surface of Python’s intricacies. Despite its simplicity, Python is a vast and rich language, and it pays (literally) to know its ins and outs.

Why It Matters

You can probably get by knowing just the basics and enough of the language to be functional in your domain, but there are advantages to knowing the quirks of what you’re working with:

  • You can write better, more creative code. Programming, at its core, is problem solving. Having more knowledge of the language allows you to consider multiple solutions to a problem.
  • **You can potentially speed up your work.** You probably won’t experience performance problems with small data sets. Things get dicier and less forgiving when you start working with at least a few thousand rows. Knowing which language constructs work faster than others can turn a script that has to run for 34 hours into one that only has to run for 1 minute (I’ll write about this soon!).
  • Your future self will thank you. Ever worked on code that you haven’t touched for six months? No? Let me tell you right now: it’s not a pleasant experience. You are no longer the problem domain expert you were six months ago. Writing code that is clean and consistent, i.e. code that is Pythonic, builds a habit of writing in a consistent style, resulting in code that is readable and predictable. This will definitely save you hours of hair pulling.
  • Lastly, pride in your own work. Okay, this is a personal one, but still one I’d like to share. Time is a very valuable resource. If you’re going to spend it creating something, might as well create something that you can be proud of. For me, in this case, it’s aligning what I code with how the language was designed. While programming is often packaged as logical problem solving, at a design level, it’s more art than engineering. This is where your creativity can shine.

What You Should Know

Learn these simple concepts and you will easily be a better Python developer:

**Empty collections are falsy, non-empty collections are truthy.** It’s not necessary to get the length of a list, dictionary, or set when using it in a conditional:

my_list = []
if not my_list:
    print('List is empty')
# This also works, but is unnecessary
if len(my_list) == 0:
    print('List is empty')
# This branch is skipped because the list is empty
if my_list:
    print('List is not empty')
# outputs:
# List is empty
# List is empty
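The same rule applies to the other built-in containers, not only lists. Here is a minimal sketch of that; `is_empty` and `maybe_list` are hypothetical names made up for this illustration. It also shows one caveat worth knowing: `None` is falsy too, so a plain `not` check cannot tell an empty list apart from a missing one.

```python
# Hypothetical helper, named only for this illustration
def is_empty(collection):
    """Return True when the collection has no elements."""
    return not collection

print(is_empty([]))        # => True (list)
print(is_empty({}))        # => True (dict)
print(is_empty(set()))     # => True (set)
print(is_empty(''))        # => True (str)
print(is_empty((1, 2)))    # => False (non-empty tuple)
print(is_empty({'a': 1}))  # => False (non-empty dict)

# Caveat: None is falsy as well, so check for it explicitly
# when "no list at all" and "empty list" mean different things
maybe_list = None
if maybe_list is None:
    print('No list was provided')
elif not maybe_list:
    print('List is empty')
# => No list was provided
```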

**Iterate through sequences using enumerate().** At the risk of sounding like an old man, it used to be that iterating through a list together with its indices needed a combination of len() and range(). enumerate() makes it simpler:

fruits = ['apples', 'oranges', 'bananas']
# yikes, archaic
for i in range(len(fruits)):
    print(i, fruits[i])
# So much better
for i, fruit in enumerate(fruits):
    print(i, fruit)
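If you need the count to begin somewhere other than 0, say for a numbered list shown to a person, enumerate() accepts a `start` argument. A small sketch, reusing the fruits list from above (`position` and `numbered` are names chosen just for this example):

```python
fruits = ['apples', 'oranges', 'bananas']

# start=1 makes the counter begin at 1 instead of 0
for position, fruit in enumerate(fruits, start=1):
    print(position, fruit)
# 1 apples
# 2 oranges
# 3 bananas

# enumerate() yields (index, item) tuples, so it can be turned into a list
numbered = list(enumerate(fruits, start=1))
print(numbered)  # => [(1, 'apples'), (2, 'oranges'), (3, 'bananas')]
```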

You can create lists from strings. Say you want to create a list of categories denoted by single letters. It’s easier to write them as one string and pass it to the list constructor than to type out the list by hand:

# Instead of typing this out by hand
categories = ['A', 'B', 'C', 'D', 'B', 'A', 'C', 'C', 'A']
# you can write this
categories = list('ABCDBACCA')
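One thing to keep in mind: the list constructor splits a string into individual characters, so the trick only works for single-character labels. For longer labels, str.split() is probably what you want. A quick sketch, with made-up labels in `levels`:

```python
# list() splits a string into individual characters
print(list('ABC'))  # => ['A', 'B', 'C']

# With multi-character labels that is rarely what you want
print(list('low'))  # => ['l', 'o', 'w']

# str.split() breaks on whitespace instead (hypothetical labels)
levels = 'low medium high low'.split()
print(levels)  # => ['low', 'medium', 'high', 'low']
```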

Tuple Packing, Sequence Unpacking. Use tuple packing whenever you need to group multiple items together, and sequence unpacking whenever you need to extract individual items from a sequence:

# say you have a pair of cartesian coordinates x and y
x = 1
y = 16
# you can `pack` them into a tuple like so
point = x, y
print(point) # => (1, 16)
# and `unpack` them like so
new_x, new_y = point
print(new_x) # => 1
print(new_y) # => 16

Swapping values between two variables can be done by taking advantage of packing and unpacking:

x, y = y, x
print(x) # => 16
print(y) # => 1

Be careful when using this. Swapping in a single statement signals that x and y are related, so using it on two unrelated variables suggests a relationship that doesn’t exist.
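Unpacking shows up in a few other places besides assignment and swapping. The sketch below, using a made-up `points` list, shows unpacking inside a for loop, a starred name that collects the leftover items, and the error you get when the number of names doesn’t match the number of items:

```python
# Hypothetical list of (x, y) points
points = [(1, 16), (2, 8), (4, 4)]

# Each tuple is unpacked into x and y on every iteration
for x, y in points:
    print(x * y)  # prints 16 three times

# A starred name collects whatever is left over into a list
first, *rest = [10, 20, 30, 40]
print(first)  # => 10
print(rest)   # => [20, 30, 40]

# The number of names must match the number of items
try:
    a, b = (1, 2, 3)
except ValueError as error:
    print('Cannot unpack:', error)
```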

F-strings. As of Python 3.6, you can use f-strings to easily pepper your strings with arbitrary expressions. For users of older versions of Python, there’s an easy solution: upgrade! If that’s not an option, you can use the str.format() method:

# Let's create a madlib
year = 2016
event = 'a party with Barack Obama'
adverb = 'hella'
adjective = 'amazing'
# No, that's not a typo.
# Add 'f' before your strings to make them into f-strings
print(f'It was back in {year} when we had {event}. It was {adverb} {adjective}.')
# => It was back in 2016 when we had a party with Barack Obama. It was hella amazing.
# For users of Python versions older than 3.6
# Use string formatting
print('It was back in {} when we had {}. It was {} {}.'.format(year, event, adverb, adjective))
# => It was back in 2016 when we had a party with Barack Obama. It was hella amazing.

The f-string evaluates the expressions inside the curly braces and automatically includes their values in the string. The format method sequentially replaces the curly braces inside the string with the arguments passed to it. As you can see, the f-string is much more succinct.
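To make "arbitrary expressions" a bit more concrete, here is a small sketch with made-up order data (`item`, `price`, and `quantity` are assumptions for this example). It puts a method call, some arithmetic, and a format specifier inside the braces, then builds the same string with str.format() by position and by name:

```python
# Hypothetical order data
item = 'coffee'
price = 3.5
quantity = 3

# Any expression can go inside the braces; ':.2f' formats to 2 decimals
receipt = f'{quantity} x {item.upper()} = {price * quantity:.2f}'
print(receipt)  # => 3 x COFFEE = 10.50

# str.format() can fill the braces by position or by name
by_position = '{} x {} = {:.2f}'.format(quantity, item.upper(), price * quantity)
by_name = '{qty} x {name} = {total:.2f}'.format(
    qty=quantity, name=item.upper(), total=price * quantity
)
print(by_position == receipt)  # => True
print(by_name == receipt)      # => True
```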

*args, **kwargs. *args and **kwargs allow your functions to accept an arbitrary number of positional and keyword arguments. This can be useful when you don’t know beforehand how many arguments your function is going to receive. Inside the function, *args can be accessed as a tuple, and **kwargs can be accessed as a dictionary:

def sum_all(*args):
    result = 0
    # args is a tuple
    for arg in args:
        result += arg
    return result

result = sum_all(1, 2, 3, 4, 5)
print(result) # => 15

# args and kwargs are just conventional names
# they can be named whatever you like
def make_pizza(**toppings):
    # toppings is a dictionary
    # double quotes on the outside, so the single-quoted keys don't end the string
    print(f"Making pizza with {toppings['main']} and extra {toppings['extra']}")

make_pizza(main='bacon', extra='cheese') # => Making pizza with bacon and extra cheese

While you may not write these in your day-to-day programming, knowing them is useful for when you encounter them in the wild.
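One place you are likely to meet them in the wild is a function that passes its arguments along to another function. The sketch below is only an illustration: `describe_call`, `log_call`, and `make_sized_pizza` are hypothetical names, not part of any library.

```python
def describe_call(*args, **kwargs):
    """Report the types the arguments arrive as (hypothetical helper)."""
    return type(args).__name__, type(kwargs).__name__

print(describe_call(1, 2, key='value'))  # => ('tuple', 'dict')


def log_call(func, *args, **kwargs):
    """Print the arguments, then forward them unchanged to func."""
    print(f'Calling {func.__name__} with {args} and {kwargs}')
    # The same * and ** syntax unpacks the arguments again at the call site
    return func(*args, **kwargs)


def make_sized_pizza(size, **toppings):
    return f"{size} pizza with {toppings['main']}"

print(log_call(make_sized_pizza, 'large', main='bacon'))
# Calling make_sized_pizza with ('large',) and {'main': 'bacon'}
# large pizza with bacon
```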

List Comprehension. List comprehensions allow you to create new lists from existing sequences (dictionaries and sets have analogous comprehensions). They provide a succinct syntax and are often preferable to for loops whose only job is to build a list. Examples would help illustrate this.

Say you want to double all the values in a list:

# The basic syntax is as follows:
# new_list = [<expression> <for loop(s)> <conditions>]
my_list = [1, 2, 3, 4, 5]
doubled = [x*2 for x in my_list] # no conditions in this case
print(doubled) # => [2, 4, 6, 8, 10]
# Equivalent to the following for loop:
doubled = []
for x in my_list:
    doubled.append(x*2)

Or maybe find the values in a list that are odd:

my_list = [1, 2, 5, 6, 9, 21, 30]
odds = [x for x in my_list if x % 2 == 1]
print(odds) # => [1, 5, 9, 21]
# Equivalent to the following for loop:
odds = []
for x in my_list:
    if x % 2 == 1:
        odds.append(x)
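The same pattern carries over to dictionaries and sets by swapping the square brackets for curly braces. A short sketch, using a made-up fruits list:

```python
fruits = ['apples', 'oranges', 'bananas']

# Dictionary comprehension: {<key>: <value> <for loop(s)> <conditions>}
lengths = {fruit: len(fruit) for fruit in fruits}
print(lengths)  # => {'apples': 6, 'oranges': 7, 'bananas': 7}

# Set comprehension: duplicate values collapse automatically
unique_lengths = {len(fruit) for fruit in fruits}
print(sorted(unique_lengths))  # => [6, 7]
```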

List comprehensions offer the following benefits:

  • Increased readability
  • Fewer lines of code
  • Potential increase in performance (a comprehension avoids the overhead of looking up and calling append() on every iteration)
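If you want to check the performance claim on your own machine, the standard library’s timeit module is one way to do it. This is only a rough sketch: the list size and the number of repetitions are arbitrary choices, and the timings depend on your hardware and Python version, so none are shown here.

```python
import timeit

numbers = list(range(10_000))

def with_loop():
    doubled = []
    for x in numbers:
        doubled.append(x * 2)
    return doubled

def with_comprehension():
    return [x * 2 for x in numbers]

# Both approaches must produce the same list
assert with_loop() == with_comprehension()

# Run each function 200 times and report the total time in seconds
loop_time = timeit.timeit(with_loop, number=200)
comprehension_time = timeit.timeit(with_comprehension, number=200)
print(f'for loop:      {loop_time:.3f}s')
print(f'comprehension: {comprehension_time:.3f}s')
```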

Master list comprehensions and I can guarantee that you will already be a better Python developer.

Well, that’s a wrap. While there are many more things that I would’ve wanted to talk about, I wanted to keep this piece as succinct as I can while providing *immediately* relevant information to the reader. After all, we’re all busy.

Practice these, integrate them into your daily code, and experiment. I guarantee that your Python skills will improve, and you will be able to create code that you can be proud of.

Further Reading

There is much more to learn about Python, both about the things I’ve mentioned and the things I wasn’t able to. If you want to explore more, I recommend the following articles:


Hi! I'm Adrian. I'm a software engineer, and I work hard to provide helpful and highly intuitive content for free. If you like what you read, please consider following me on Twitter. Hope to see you again next time!

All materials © Adrian Perea 2020