6  pathlib: Manipulate File Paths

6.1 Why pathlib

Traditional string based file path manipulations can be error-prone and cumbersome.

import os
import pandas as pd

# Works on Windows, but breaks on Mac/Linux
path = "data\\raw\\report.csv"
df = pd.read_csv(path)

For example, the above code uses backslashes (\) which are specific to Windows file paths. This code will break on Mac or Linux systems that use forward slashes (/) instead. You could use os.path.join() to make it platform-independent, but it can get messy with complex paths. You could also use forward slashes, which work on both Windows and Mac/Linux, but it can be less readable for those accustomed to their platform’s conventions. The bottom line is that string based paths can lead to bugs and maintenance challenges.

The pathlib module in Python provides an object-oriented approach to handle filesystem paths, making it easier to read, write, and manipulate paths in a platform-independent way.

6.2 Using pathlib

To get started with pathlib, you need to import the Path class from the pathlib module. The above code can be rewritten using pathlib as follows:

from pathlib import Path
import pandas as pd

path = Path("data") / "raw" / "report.csv"
df = pd.read_csv(path)

In this example, we create a Path object representing the file path. The / operator is overloaded in the Path class to join path components in a platform-independent way. This makes the code more readable and less error-prone.

Here is another example. Suppose we have a notes folder with many .txt files and want to

  • Change their extension to .md
  • Prefix the filename with “archived_”
from pathlib import Path

folder = Path("notes")

for path in folder.glob("*.txt"):
    new_path = path.with_name(f"archived_{path.stem}").with_suffix(".md")
    path.replace(new_path)
    print(f"Renamed {path} to {new_path}")

The code above does the following:

  • folder.glob() finds all .txt files in the notes directory
  • path.stem gives filename without the extension (reliable)
  • with_name() safely constructs a new filename
  • with_suffix() changes the file extension
  • path.replace() moves/renames cleanly

The code is easy to read and works on any OS without changes.

6.3 Learning Resources

pathlib has many practical use cases. To learn more about its features and capabilities, the following resources may be helpful: