Programmer's Python Data - Text Files & CSV
Written by Mike James
Tuesday, 10 June 2025

Files are fundamental to computing and text files are human readable - most of the time. Find out how to understand and work with CSV files in this extract from Programmer's Python: Everything is Data.

Programmer's Python
Everything is Data

Is now available as a print book: Amazon

Contents

  1. Python – A Lightning Tour
  2. The Basic Data Type – Numbers
    Extract: Bignum
  3. Truthy & Falsey
  4. Dates & Times
    Extract Naive Dates
  5. Sequences, Lists & Tuples
    Extract Sequences
  6. Strings
    Extract Unicode Strings
  7. Regular Expressions
    Extract Simple Regular Expressions
  8. The Dictionary
    Extract The Dictionary
  9. Iterables, Sets & Generators
    Extract Iterables
  10. Comprehensions
    Extract Comprehensions
  11. Data Structures & Collections
    Extract Stacks, Queues and Deques
    Extract Named Tuples and Counters
  12. Bits & Bit Manipulation
    Extract Bits and BigNum
    Extract Bit Masks
  13. Bytes
    Extract Bytes And Strings
    Extract Byte Manipulation
  14. Binary Files
    Extract Files and Paths
  15. Text Files
    Extract Text Files & CSV
  16. Creating Custom Data Classes
    Extract A Custom Data Class
  17. Python and Native Code
    Extract Native Code
    Appendix I Python in Visual Studio Code
    Appendix II C Programming Using Visual Studio Code

While text mode is usually regarded as the simpler option for using files, there are arguments that it is the more complex due to the variations that are possible in data representation and meaning. Most people also suggest that you should use text mode for your custom files because they are human readable and editable using nothing but a text editor. This is an advantage over binary format files, but it also makes it possible for users to attempt to manually modify files, often with unexpected outcomes and errors. It also used to be argued that binary files were better because they were more compact and hence faster to work with and used less space. Today this is hardly an advantage, as storage is no longer in short supply.

The current situation is that text files do have the advantage of being human readable and editable, but this isn’t always desirable. Binary files for internal consumption still have advantages and, even if you don’t create them yourself, you cannot avoid encountering them.

If you do decide to use text files to store data you have the problem of extracting the data and converting it to internal data types. In most cases this requires the file to have a fixed format so that you can parse it. What this means is that discussing text files leads on naturally to the consideration of standard data file formats and in this chapter we also look at CSV, JSON, XML and pickle.
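As a rough illustration of the parsing problem, the following sketch writes a few records as text and converts them back to Python types. The one-record-per-line, comma-separated layout and the file name people.txt are assumptions made up for this example, not a format defined in the text, and the approach breaks as soon as a field itself contains a comma, which is one reason standard formats exist:

```python
import tempfile
from pathlib import Path

# Hypothetical data: (name, age, height) records
records = [("Alice", 30, 1.65), ("Bob", 25, 1.80)]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "people.txt"  # illustrative file name

    # Everything is written as text, so the types are lost in the file
    with path.open(mode="wt", encoding="utf8") as f:
        for name, age, height in records:
            f.write(f"{name},{age},{height}\n")

    # Reading back gives strings that have to be converted explicitly
    parsed = []
    with path.open(mode="rt", encoding="utf8") as f:
        for line in f:
            name, age, height = line.rstrip("\n").split(",")
            parsed.append((name, int(age), float(height)))

print(parsed)  # [('Alice', 30, 1.65), ('Bob', 25, 1.8)]
```

The conversion calls int and float only work because the reader already knows the order and type of each field, which is what a fixed format provides.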

Opening a Text File

A text file is nothing more than a binary file that is treated as if it were an encoding of a text string. You can achieve the same result using a binary file and explicit calls to decode/encode, but opening a file in text mode performs this automatically, and read and write both work in terms of Python strings.
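The equivalence can be sketched as follows. The file name demo.txt and the sample string are assumptions for illustration, and the sample deliberately contains no line endings because text mode may also translate newlines, which binary mode does not:

```python
import tempfile
from pathlib import Path

text = "Hello World \u00e9"  # ends with é so the encoding is visible

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.txt"  # illustrative file name

    # Binary mode: we encode the string to bytes ourselves
    with path.open(mode="wb") as f:
        f.write(text.encode("utf8"))

    # Binary mode: we get bytes back and decode them ourselves
    with path.open(mode="rb") as f:
        raw = f.read()
    from_binary = raw.decode("utf8")

    # Text mode: the same decoding is done for us
    with path.open(mode="rt", encoding="utf8") as f:
        from_text = f.read()

print(raw)                       # b'Hello World \xc3\xa9'
print(from_binary == from_text)  # True
```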

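If you open a file in text mode you should specify the encoding in use, i.e. how the bytes in the file represent the Unicode text, by setting the encoding parameter in the call to open. The parameter is optional, but if you omit it Python falls back to a default that can depend on the platform and locale, and this may not match the file.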

For example:

open(path, mode='rt', encoding="utf8")

or

path.open(mode='rt', encoding="utf8")

to work with UTF-8 encoding.
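To see why the encoding matters, the following sketch writes a string as UTF-8 and then reads it back twice, once with the correct encoding and once with Latin-1. The file name and sample text are assumptions for illustration:

```python
import tempfile
from pathlib import Path

text = "caf\u00e9"  # "café"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.txt"  # illustrative file name

    with path.open(mode="wt", encoding="utf8") as f:
        f.write(text)

    # Same encoding as used for writing: the text survives
    with path.open(mode="rt", encoding="utf8") as f:
        right = f.read()

    # Wrong encoding: no error, but each UTF-8 byte becomes a character
    with path.open(mode="rt", encoding="latin-1") as f:
        wrong = f.read()

print(right == text)  # True
print(ascii(wrong))   # 'caf\xc3\xa9'
```

Reading with the wrong encoding does not necessarily raise an exception. Here the two bytes that encode the accented letter are silently returned as two separate characters.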

Once the file has been opened in text mode the write method accepts a string and the read method returns a string. It really is this simple: the string is encoded to a UTF-8 byte stream when written to the file and decoded back to a Unicode string when read from the file, for example:

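from pathlib import Path

path = Path("myText.txt")  # illustrative file name
with path.open(mode="wt", encoding="utf8") as f:
    f.write("Hello World")
with path.open(mode="rt", encoding="utf8") as f:
    myString = f.read()
print(myString)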

In this case we write a string to the file and then read the entire file back, which is simply the string we wrote. In general, there will be a conversion between representations and as such there is the possibility that the conversion cannot be performed, i.e. that a character in one of the encodings cannot be represented in the other, or that the bytes in the file are not valid in the specified encoding. In this case by default you will generate a UnicodeEncodeError on writing or a UnicodeDecodeError on reading, both of which are subclasses of ValueError. You can control what happens by setting the errors parameter in the open function which works in exactly the same way as in decode/encode, see Chapter 6.
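A minimal sketch of the errors parameter, using the standard handlers "replace" and "ignore" alongside the default strict behavior. The file name, the sample text and the choice of ASCII as the restrictive encoding are assumptions for illustration:

```python
import tempfile
from pathlib import Path

text = "caf\u00e9"  # é cannot be represented in ASCII

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.txt"  # illustrative file name

    # Default (strict) handling: writing raises an exception
    try:
        with path.open(mode="wt", encoding="ascii") as f:
            f.write(text)
        write_error = None
    except UnicodeEncodeError as e:
        write_error = e

    # errors="replace" substitutes "?" for what cannot be encoded
    with path.open(mode="wt", encoding="ascii", errors="replace") as f:
        f.write(text)
    replaced_on_write = path.read_bytes()

    # A file holding a byte sequence that is not valid UTF-8
    path.write_bytes(b"caf\xe9!")
    try:
        with path.open(mode="rt", encoding="utf8") as f:
            f.read()
        read_error = None
    except UnicodeDecodeError as e:
        read_error = e

    # On reading, "replace" inserts U+FFFD and "ignore" drops the byte
    with path.open(mode="rt", encoding="utf8", errors="replace") as f:
        replaced_on_read = f.read()
    with path.open(mode="rt", encoding="utf8", errors="ignore") as f:
        ignored_on_read = f.read()

print(isinstance(write_error, ValueError))  # True
print(replaced_on_write)                    # b'caf?'
print(ascii(replaced_on_read))              # 'caf\ufffd!'
print(ignored_on_read)                      # caf!
```

Both "replace" and "ignore" lose information, so they are best kept for cases where damaged text is preferable to a failed read or write.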

