Programmer's Python Data - Text Files & CSV

Written by Mike James

Tuesday, 10 June 2025


Files are fundamental to computing and text files are human readable - most of the time. Find out how to understand and work with CSV files in this extract from Programmer's Python: Everything is Data.

Programmer's Python
Everything is Data

Is now available as a print book: Amazon

Contents

Python – A Lightning Tour
The Basic Data Type – Numbers
Extract: Bignum
Truthy & Falsey
Dates & Times
Extract Naive Dates
Sequences, Lists & Tuples
Extract Sequences
Strings
Extract Unicode Strings
Regular Expressions
Extract Simple Regular Expressions
The Dictionary
Extract The Dictionary
Iterables, Sets & Generators
Extract Iterables
Comprehensions
Extract Comprehensions
Data Structures & Collections
Extract Stacks, Queues and Deques
Extract Named Tuples and Counters
Bits & Bit Manipulation
Extract Bits and BigNum
Extract Bit Masks
Bytes
Extract Bytes And Strings
Extract Byte Manipulation
Binary Files
Extract Files and Paths
Text Files
Extract Text Files & CSV
Creating Custom Data Classes
Extract A Custom Data Class
Python and Native Code
Extract Native Code
Appendix I Python in Visual Studio Code
Appendix II C Programming Using Visual Studio Code


While text mode is usually regarded as the simpler option for using files, there are arguments that it is the more complex due to the variations that are possible in data representation and meaning. Most people suggest that you should use text mode for your custom files because they are human readable and editable using nothing but a text editor. This is an advantage over binary format files, but it also makes it possible for users to modify files manually, often with unexpected outcomes and errors. It also used to be argued that binary files were better because they were more compact and hence faster to work with and used less space. Today this is hardly an advantage, as storage is no longer in short supply.

The current situation is that text files do have the advantage of being human readable and editable, but this isn’t always desirable. Binary files for internal consumption still have advantages and, even if you don’t create them yourself, you cannot avoid encountering them.

If you do decide to use text files to store data you have the problem of extracting the data and converting it to internal data types. In most cases this requires the file to have a fixed format so that you can parse it. What this means is that discussing text files leads on naturally to the consideration of standard data file formats and in this chapter we also look at CSV, JSON, XML and pickle.
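To make the parsing problem concrete, here is a minimal sketch that assumes a hypothetical fixed format of one record per line, with a name and an integer age separated by a comma. The format, the sample data and the function parse_record are all invented for illustration:

```python
def parse_record(line):
    # Hypothetical "name,age" format: split on the first comma only
    name, age = line.rstrip("\n").split(",", 1)
    # Convert the text fields to internal data types (str and int)
    return name.strip(), int(age)

# Sample lines as they might be read from a text file
lines = ["Alice, 42\n", "Bob,27\n"]
records = [parse_record(line) for line in lines]
print(records)
```

Even this simple format breaks as soon as a name contains a comma, which is the kind of problem that standard formats such as CSV are designed to handle.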

Opening a Text File

A text file is nothing more than a binary file that is treated as if it were an encoding of a text string. You can achieve the same result using a binary file and explicit calls to decode/encode, but opening a file in text mode performs this automatically and the read and write methods both work in terms of Python strings.
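As a rough sketch of that equivalence, the following writes a string by encoding it explicitly in binary mode and then reads it back both ways. The file name demo.txt and the use of a temporary directory are assumptions made purely for illustration:

```python
import tempfile
from pathlib import Path

# Hypothetical location, used only for this illustration
path = Path(tempfile.mkdtemp()) / "demo.txt"

text = "Hello World \u00e9"

# Binary mode: we perform the encoding ourselves
with path.open(mode="wb") as f:
    f.write(text.encode("utf8"))

# Text mode: open() decodes the bytes for us
with path.open(mode="rt", encoding="utf8") as f:
    from_text_mode = f.read()

# Binary mode again, this time decoding explicitly
with path.open(mode="rb") as f:
    from_binary_mode = f.read().decode("utf8")

print(from_text_mode == from_binary_mode == text)
```

One difference not shown here is that text mode also translates line endings by default, which binary mode does not.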

When you open a file in text mode you should specify the encoding in use, i.e. how the bytes in the file represent the Unicode text, using the encoding parameter in the call to open. If you omit it, Python falls back to a platform-dependent default, which may not be UTF-8.

For example:

open(path, mode='rt', encoding="utf8")

path.open(mode='rt', encoding="utf8")

to work with UTF-8 encoding.

Once the file has been opened in text mode the write method accepts a string and the read method returns a string. It really is this simple: the string is encoded to a UTF-8 byte stream when written to the file and decoded back to a Unicode string when read from the file, for example:

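from pathlib import Path
path = Path("test.txt")  # example file name, chosen for illustration
with path.open(mode="wt", encoding="utf8") as f:
    f.write("Hello World")
with path.open(mode="rt", encoding="utf8") as f:
    myString = f.read()
print(myString)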

In this case we write a string to the file and then read the entire file back, which is simply the string we wrote. In general, there will be a conversion between representations and so there is the possibility that the conversion cannot be performed, i.e. that a character in one of the encodings cannot be represented in the other. In this case by default you will generate a UnicodeEncodeError or UnicodeDecodeError exception, both of which are subclasses of ValueError. You can control what happens by setting the errors parameter in the open function, which works in exactly the same way as in decode/encode, see Chapter 6.
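The effect of the errors parameter can be sketched as follows. The example assumes a hypothetical file, latin.txt in a temporary directory, that was written in Latin-1 and is then read back as UTF-8, so that one byte cannot be decoded:

```python
import tempfile
from pathlib import Path

# Hypothetical file, used only for this illustration
path = Path(tempfile.mkdtemp()) / "latin.txt"

# These bytes are valid Latin-1 but not valid UTF-8
path.write_bytes("caf\u00e9!".encode("latin-1"))

# Default behavior (errors="strict"): decoding raises an exception
try:
    with path.open(mode="rt", encoding="utf8") as f:
        f.read()
    failed = False
except UnicodeDecodeError as e:
    failed = True
    is_value_error = isinstance(e, ValueError)

# errors="replace": the undecodable byte becomes U+FFFD
with path.open(mode="rt", encoding="utf8", errors="replace") as f:
    replaced = f.read()

# errors="ignore": the undecodable byte is silently dropped
with path.open(mode="rt", encoding="utf8", errors="ignore") as f:
    ignored = f.read()

print(failed, repr(replaced), repr(ignored))
```

Which setting is appropriate depends on whether silently losing or altering characters is acceptable for the data in question.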


Last Updated ( Tuesday, 10 June 2025 )
