Splitting, Concatenating, and Joining Strings in Python

There are few guarantees in life: death, taxes, and programmers needing to deal with strings. Strings can come in many forms. They could be unstructured text, usernames, product descriptions, database column names, or really anything else that we describe using language.

With the near-ubiquity of string data, it’s important to master the tools of the trade when it comes to strings. Luckily, Python makes string manipulation very simple, especially when compared to other languages and even older versions of Python.

In this article, you will learn some of the most fundamental string operations: splitting, concatenating, and joining. Not only will you learn how to use these tools, but you will walk away with a deeper understanding of how they work under the hood.

Splitting Strings

In Python, strings are represented as str objects, which are immutable: this means that the object as represented in memory cannot be directly altered. These two facts can help you learn (and then remember) how to use .split().

Have you guessed how those two features of strings relate to splitting functionality in Python? If you guessed that .split() is an instance method because strings are their own type (str), you would be correct! In some other languages (like Perl), the original string serves as an input to a standalone split() function rather than a method called on the string itself.

Note: Ways to Call String Methods

String methods like .split() are mainly shown here as instance methods that are called on strings. They can also be called through the str class itself, with the string passed as the first argument, but this isn’t ideal because it’s more “wordy.” For the sake of completeness, here’s an example:

# Avoid this:
str.split('a,b,c', ',')

This is bulky and unwieldy when you compare it to the preferred usage:

# Do this instead:
'a,b,c'.split(',')

For more on instance, class, and static methods in Python, check out our in-depth tutorial.
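One situation where the class-level form can arguably be useful is when the method itself has to be handed to another function, such as map(). The following is a minimal sketch; the lines list is made-up sample data, not something from this article.

```python
# Hypothetical sample data for illustration only
lines = ['do re mi', 'fa sol']

# Instance-method form: the usual, preferred spelling
preferred = [line.split() for line in lines]

# Class-level form: str.split is passed around as a plain function,
# and map() supplies each string as its first argument
via_class = list(map(str.split, lines))

print(preferred)
print(via_class)
```

Both spellings produce the same nested list, so the choice is mostly about readability.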

What about string immutability? This should remind you that string methods are not in-place operations, but they return a new object in memory.
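As a rough illustration of the difference, the sketch below contrasts a string method, which returns a new object, with a list method that changes the list it is called on. The variable names are chosen only for this example.

```python
text = 'a,b,c'
parts = text.split(',')        # returns a new list; text itself is untouched

letters = ['a', 'b']
result = letters.append('c')   # changes letters directly and returns None

print(text)     # still the original string
print(parts)
print(letters)  # now has three elements
print(result)
```

Because .split() hands back a new object, you need to assign its result to a name if you want to keep it.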

Note: In-Place Operations

In-place operations are operations that directly change the object on which they are called. A common example is the .append() method that is used on lists: when you call .append() on a list, that list is directly changed by adding the input to .append() to the same list.

Splitting Without Parameters

Before going deeper, let’s look at a simple example:

>>> 'this is my string'.split()

#['this', 'is', 'my', 'string']

This is actually a special case of a .split() call, which I chose for its simplicity. Without any separator specified, .split() will count any whitespace as a separator.
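"Any whitespace" covers more than the space character. A short sketch, using a made-up string that mixes tabs, newlines, and spaces:

```python
# A string containing a tab, a newline, and a carriage return + newline
messy = 'this\tis\nmy \r\n string'

# With no separator, every run of whitespace counts as one separator
words = messy.split()

print(words)  # ['this', 'is', 'my', 'string']
```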

Another feature of the bare call to .split() is that it automatically discards leading and trailing whitespace and treats any run of consecutive whitespace as a single separator. Compare calling .split() on the following string without a separator parameter and with ' ' as the separator parameter:

>>> s = ' this   is  my string '
>>> s.split()

#['this', 'is', 'my', 'string']

>>> s.split(' ')
#['', 'this', '', '', 'is', '', 'my', 'string', '']

The first thing to notice is that this showcases the immutability of strings in Python: subsequent calls to .split() work on the original string, not on the list result of the first call to .split().

The second—and the main—thing you should see is that the bare .split() call extracts the words in the sentence and discards any whitespace.

Specifying Separators

.split(' '), on the other hand, is much more literal. When there are leading or trailing separators, you’ll get an empty string, which you can see in the first and last elements of the resulting list.

Where there are multiple consecutive separators (such as between “this” and “is” and between “is” and “my”), each one is treated as a separator in its own right, so the empty strings that sit between adjacent separators find their way into your result list.
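The same literal behavior shows up with any explicit separator. The sketch below uses a hypothetical comma-separated row and then filters out the empty strings, which is one possible way to clean up the result:

```python
# Hypothetical row with an empty field in the middle and a trailing comma
row = 'a,,b,'

fields = row.split(',')
print(fields)  # ['a', '', 'b', '']

# Keep only the non-empty fields (empty strings are falsy)
non_empty = [field for field in fields if field]
print(non_empty)  # ['a', 'b']
```

Whether empty fields should be dropped depends on your data: in some formats an empty field is meaningful and should be kept.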

Note: Separators in Calls to .split()

While the above example uses a single space character as a separator input to .split(), you aren’t limited in the types of characters or length of strings you use as separators. The only requirement is that your separator be a non-empty string. You could use anything from "..." to even "separator".
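Here is a small sketch of multi-character separators, using made-up input strings. It also shows what happens with an empty separator, which .split() rejects with a ValueError:

```python
dotted = 'do...re...mi'
notes = dotted.split('...')
print(notes)  # ['do', 're', 'mi']

worded = '1separator2separator3'.split('separator')
print(worded)  # ['1', '2', '3']

# An empty string is not accepted as a separator
try:
    dotted.split('')
except ValueError:
    empty_separator_rejected = True
    print('empty separator is not allowed')
```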

Limiting Splits With Maxsplit

.split() has another optional parameter called maxsplit. By default, .split() will make all possible splits when called. When you give a value to maxsplit, however, only the given number of splits will be made. Using our previous example string, we can see maxsplit in action:

>>> s = "this is my string"
>>> s.split(maxsplit=1)

#['this', 'is my string']

As you see above, if you set maxsplit to 1, the first whitespace region is used as the separator, and the rest are ignored.
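maxsplit can also be combined with an explicit separator. The sketch below assumes a hypothetical key=value entry whose value may itself contain the separator, which is a typical case where limiting the number of splits helps:

```python
# Hypothetical entry: only the first '=' separates key from value
entry = 'path=/usr/bin=extra'

key, value = entry.split('=', maxsplit=1)
print(key)    # 'path'
print(value)  # '/usr/bin=extra'

# maxsplit=-1 is the default and means "no limit"
unlimited = entry.split('=', maxsplit=-1)
print(unlimited)  # ['path', '/usr/bin', 'extra']

# maxsplit=0 performs no splits at all
zero = entry.split('=', maxsplit=0)
print(zero)  # ['path=/usr/bin=extra']
```

In general, the resulting list has at most maxsplit + 1 elements.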

Concatenating and Joining Strings

The other fundamental string operation is the opposite of splitting strings: string concatenation. If you haven’t seen this word, don’t worry. It’s just a fancy way of saying “gluing together.”

Concatenating With the + Operator

There are a few ways of doing this, depending on what you’re trying to achieve. The simplest and most common method is to use the plus symbol (+) to add multiple strings together. Simply place a + between as many strings as you want to join together:

>>> 'a' + 'b' + 'c'

#'abc'

In keeping with the math theme, you can also multiply a string to repeat it:

>>> 'do' * 2

#'dodo'

Remember, strings are immutable! If you concatenate or repeat a string stored in a variable, the original string is left unchanged, so you will have to assign the result to a variable in order to keep it:

>>> orig_string = 'Hello'
>>> orig_string + ', world'
#'Hello, world'
>>> orig_string
#'Hello'
>>> full_sentence = orig_string + ', world'
>>> full_sentence
#'Hello, world'

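If + modified strings in place instead of returning a new one, orig_string would already contain 'Hello, world' after the first expression, and full_sentence would instead be 'Hello, world, world'.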
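The same applies to the += operator, which can look like an in-place change but actually binds the name to a new string. A minimal sketch, where alias is a name introduced only for this example:

```python
orig_string = 'Hello'
alias = orig_string        # second name for the same 'Hello' object

orig_string += ', world'   # builds a new string and rebinds orig_string

print(orig_string)  # 'Hello, world'
print(alias)        # 'Hello' (the original object was never modified)
```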

Another note is that Python does not do implicit string conversion. If you try to concatenate a string with a non-string type, Python will raise a TypeError:

>>> 'Hello' + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str, not int

This is because you can only concatenate strings with other strings, which may be new behavior for you if you’re coming from a language like JavaScript, which attempts to do implicit type conversion. The exact wording of the error message depends on your Python version.
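If you do want to combine a string with a number, you have to convert explicitly. Two common options are sketched below: calling str() and using an f-string (available in Python 3.6 and later). The count variable is just an example value.

```python
count = 2

# Option 1: convert explicitly, then concatenate
explicit = 'Hello' + str(count)

# Option 2: let an f-string do the formatting
formatted = f'Hello{count}'

print(explicit)   # 'Hello2'
print(formatted)  # 'Hello2'

# Without a conversion, the concatenation fails
try:
    'Hello' + count
except TypeError:
    concatenation_failed = True
```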

Going From a List to a String in Python With .join()

There is another, more powerful, way to join strings together. You can go from a list to a string in Python with the .join() method.

The common use case here is when you have an iterable—like a list—made up of strings, and you want to combine those strings into a single string. Like .split(), .join() is a string instance method. If all of your strings are in an iterable, which one do you call .join() on?

This is a bit of a trick question. Remember that when you use .split(), you call it on the string you want to split and pass the separator as an argument. With .join(), the roles are reversed: you call it on the string or character you want to use to join your iterable of strings together, and you pass the iterable as the argument:

>>> strings = ['do', 're', 'mi']
>>> ','.join(strings)

#'do,re,mi'

Here, we join the elements of the strings list with a comma (,), calling .join() on the comma rather than on the strings list.
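A few more variations, as a sketch: a different joiner, a round trip through .split(), and a list of numbers. Note that .join() raises a TypeError if any element is not a string, so the numbers are converted with str() first. The numbers list is invented for this example.

```python
strings = ['do', 're', 'mi']

# Any string can act as the joiner
spaced = ' '.join(strings)
print(spaced)  # 'do re mi'

# Joining and then splitting on the same separator gets the list back
round_trip = ','.join(strings).split(',')
print(round_trip)  # ['do', 're', 'mi']

# Non-string items must be converted before joining
numbers = [1, 2, 3]
joined_numbers = '-'.join(str(n) for n in numbers)
print(joined_numbers)  # '1-2-3'
```

The round trip only works cleanly when none of the strings contains the separator itself.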


.join() is smart in that it inserts your “joiner” in between the strings in the iterable you want to join, rather than just adding your joiner at the end of every string in the iterable. This means that if you pass an iterable of size 1, you won’t see your joiner:

>>> 'b'.join(['a'])

#'a'
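Two related edge cases are sketched below: an empty iterable, and a plain string passed as the iterable (a string is itself an iterable of its characters).

```python
# Nothing to join: the result is an empty string
empty = ','.join([])
print(repr(empty))  # ''

# One element: the joiner never appears
single = ','.join(['a'])
print(single)  # 'a'

# A string is iterated character by character
spelled = '-'.join('abc')
print(spelled)  # 'a-b-c'
```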


Tying It All Together

While this concludes this overview of the most basic string operations in Python (splitting, concatenating, and joining), there is still a whole universe of string methods that can make your experiences with manipulating strings much easier.

Once you have mastered these basic string operations, you may want to learn more. Luckily, we have a number of great tutorials to help you complete your mastery of Python’s features that enable smart string manipulation.

Reference : https://realpython.com/python-string-split-concatenate-join/
