3
\$\begingroup\$

I need a fairly simple file encryptor/decryptor in Python. After some research, I decided to use the PyNaCl library: I read the file in blocks, encrypt each block, write it back out, and then at the end use Blake2b to generate a signature for the file. Each file is encrypted with a unique key, which will be distributed alongside the encrypted file. The file key is RSA-encrypted using a pre-shared key pair, and that whole message is signed with ECDSA to verify that it came from me.

The encryption/decryption example code:

import base64
import struct
import nacl.encoding
import nacl.secret
import nacl.utils
import nacl.hashlib
import nacl.hash

BUFFER_SIZE = 4 * (1024 * 1024)


def read_file_blocks(file, extra_bytes=0):
    while True:
        data = file.read(BUFFER_SIZE + extra_bytes)
        if not data:
            break
        yield data


def hmac_file(file, key):
    blake = nacl.hashlib.blake2b(key=key)
    with open(file, 'rb') as in_file:
        for block in read_file_blocks(in_file):
            blake.update(block)
    return blake.hexdigest()


def encrypt_archive(archive_name, encrypted_name):
    key = nacl.utils.random(nacl.secret.SecretBox.KEY_SIZE)
    #Use 4 bytes less than the nonce size to make room for the block counter
    nonce = nacl.utils.random(nacl.secret.SecretBox.NONCE_SIZE - 4)
    block_num = 0
    box = nacl.secret.SecretBox(key)
    with open(archive_name, 'rb') as in_file, open(encrypted_name, 'wb') as out_file:
        for data in read_file_blocks(in_file):
            #Append the block counter to the nonce, so each block has a unique nonce
            block_nonce = nonce + struct.pack(">I", block_num)
            block = box.encrypt(data, block_nonce)
            out_file.write(block.ciphertext)
            block_num += 1
    hmac_key = nacl.hash.sha256(key + nonce, encoder=nacl.encoding.RawEncoder)
    output = {}
    output['key'] = base64.b64encode(key + nonce)
    output['signature'] = hmac_file(encrypted_name, hmac_key)
    return output


def decrypt_archive(encrypted_name, archive_name, key_info):
    key_bytes = base64.b64decode(key_info['key'])
    key = key_bytes[:nacl.secret.SecretBox.KEY_SIZE]
    nonce = key_bytes[nacl.secret.SecretBox.KEY_SIZE:]
    extra_bytes = nacl.secret.SecretBox.MACBYTES
    hmac_key = nacl.hash.sha256(key_bytes, encoder=nacl.encoding.RawEncoder)
    hmac = hmac_file(encrypted_name, hmac_key)
    if hmac != key_info['signature']:
        print('hmac mismatch')
        return
    block_num = 0
    box = nacl.secret.SecretBox(key)
    with open(encrypted_name, 'rb') as in_file, open(archive_name, 'wb') as out_file:
        # nacl adds a MAC to each block, when reading the file in, this needs to be taken into account
        for data in read_file_blocks(in_file, extra_bytes=extra_bytes):
            block_nonce = nonce + struct.pack(">I", block_num)
            block = box.decrypt(data, block_nonce)
            out_file.write(block)
            block_num += 1


key_info = encrypt_archive("C:\\temp\\test.csv", "C:\\temp\\test.enc")
print(key_info)
decrypt_archive("C:\\temp\\test.enc", "C:\\temp\\test.enc.csv", key_info)

Outside of general mistakes, there are two things I'm doing that I'm not entirely sure are sound:

  1. To keep the block nonces unique, I generate a random byte string slightly shorter than the required nonce size, then when encrypting each block I append the block number to it as a four-byte integer.

  2. For the key of the Blake2b hash, I hash the file key and nonce. This seems somewhat useless overall, since anyone who has the key and nonce could just replace the file. However, I can't really think of a better alternative that doesn't have similar weaknesses. Should I just ditch that bit, since NaCl does per-block MACs anyhow? (I found that out only after I wrote the hmac code.)
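The nonce layout from point 1 can be sketched with the standard library alone. This is only an illustration: `NONCE_SIZE` is assumed to match `nacl.secret.SecretBox.NONCE_SIZE` (24 bytes), `os.urandom` stands in for `nacl.utils.random`, and the helper names are hypothetical.

```python
import os
import struct

NONCE_SIZE = 24    # assumed to equal nacl.secret.SecretBox.NONCE_SIZE
COUNTER_SIZE = 4   # big-endian unsigned 32-bit block counter


def make_nonce_prefix():
    """Return the random per-file part of the nonce (20 bytes)."""
    return os.urandom(NONCE_SIZE - COUNTER_SIZE)


def block_nonce(prefix, block_num):
    """Append the block counter so every block gets a distinct nonce."""
    # struct.pack raises struct.error once block_num exceeds 2**32 - 1,
    # so the counter cannot silently wrap around and repeat a nonce.
    return prefix + struct.pack(">I", block_num)


prefix = make_nonce_prefix()
nonces = [block_nonce(prefix, i) for i in range(3)]
```

With 4 MiB blocks, a 32-bit counter covers 2^32 blocks, or 16 PiB per file, before the counter runs out.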

asked May 20, 2019 at 1:15
\$\endgroup\$

1 Answer

3
\$\begingroup\$

My comments on the code and protocol appear under each quoted code section.

def read_file_blocks(file, extra_bytes=0):
    while True:
        data = file.read(BUFFER_SIZE + extra_bytes)

At least specify why the extra bytes are used, and document what the extra_bytes argument means.
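One way the generator might be documented is sketched below. The docstring wording is my own, and the claim that the overhead is the per-block MAC is taken from the comment in the question's decrypt_archive.

```python
import io

BUFFER_SIZE = 4 * 1024 * 1024  # plaintext bytes per block, as in the question


def read_file_blocks(file, extra_bytes=0):
    """Yield successive chunks of BUFFER_SIZE + extra_bytes bytes from file.

    extra_bytes is the per-block overhead added by encryption. When reading
    ciphertext, pass the MAC length (nacl.secret.SecretBox.MACBYTES in the
    question's code) so that each chunk lines up with one encrypted block.
    The last chunk may be shorter.
    """
    while True:
        data = file.read(BUFFER_SIZE + extra_bytes)
        if not data:
            break
        yield data


# Usage with an in-memory file: two full chunks plus a 5-byte remainder.
sample = io.BytesIO(b"x" * (2 * (BUFFER_SIZE + 16) + 5))
sizes = [len(chunk) for chunk in read_file_blocks(sample, extra_bytes=16)]
```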

def hmac_file(file, key):

Why would you go over the entire file and then dispose of the read bytes only to perform HMAC? You just processed all the blocks in memory! Why blake2b instead of one of the more common SHA-2 hashes?
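If a whole-file MAC is kept at all, it could be computed in the same pass that writes the ciphertext. The sketch below assumes a hypothetical `encrypt_block(data, block_num)` callable (for example, a thin wrapper around `SecretBox.encrypt`) and uses the standard library's `hashlib.blake2b` as a stand-in for `nacl.hashlib.blake2b`, so the digest length may differ from the question's.

```python
import hashlib
import io


def encrypt_and_mac(in_file, out_file, encrypt_block, mac_key,
                    block_size=4 * 1024 * 1024):
    """Encrypt in_file block by block and MAC the ciphertext in the same pass.

    encrypt_block(data, block_num) is a hypothetical callable that returns
    the ciphertext for one block.
    """
    mac = hashlib.blake2b(key=mac_key)
    block_num = 0
    while True:
        data = in_file.read(block_size)
        if not data:
            break
        ciphertext = encrypt_block(data, block_num)
        out_file.write(ciphertext)
        mac.update(ciphertext)  # hash the bytes while they are still in memory
        block_num += 1
    return mac.hexdigest()


# Demonstration with a stand-in "cipher" (byte reversal) that is NOT encryption.
src = io.BytesIO(b"abcdefghij")
dst = io.BytesIO()
tag = encrypt_and_mac(src, dst, lambda data, n: data[::-1], b"k" * 32,
                      block_size=4)
```

This avoids reopening and rereading the encrypted file just to hash it.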

#Use 4 bytes less than the nonce size to make room for the block counter

I'll bet that this isn't required; undoubtedly the counter is already included. Maybe you can point to a specific requirement to include a block counter every 4MiB?

hmac_key = nacl.hash.sha256(key + nonce, encoder=nacl.encoding.RawEncoder)

At least use HMAC instead of a plain hash to derive keys; that is a poor man's KDF.
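A minimal sketch of what HMAC-based derivation could look like, using only the standard library. The function name and the `b"file-mac:"` label are hypothetical, and this is one reading of the suggestion rather than a vetted KDF design.

```python
import hashlib
import hmac


def derive_mac_key(file_key, nonce_prefix):
    """Derive a separate MAC key from the file key with HMAC-SHA256.

    The b"file-mac:" label is a hypothetical domain-separation string; it
    keeps this key distinct from any other key derived from the same file key.
    """
    return hmac.new(file_key, b"file-mac:" + nonce_prefix,
                    hashlib.sha256).digest()


file_key = bytes(range(32))   # placeholder key, for illustration only
nonce_prefix = bytes(20)      # placeholder 20-byte nonce prefix
mac_key = derive_mac_key(file_key, nonce_prefix)
```

The file key is used as the HMAC key and the nonce and label as the message, so the derived key depends on the secret in a keyed way rather than through plain concatenation.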

output['key'] = base64.b64encode(key + nonce)

Sorry, output what? The key?

output['signature'] = hmac_file(encrypted_name, hmac_key)

SecretBox already uses an authenticated cipher, so there is no need at all to HMAC the file again.

Algorithm details:
Encryption: Salsa20 stream cipher
Authentication: Poly1305 MAC

answered Mar 2, 2020 at 0:57
\$\endgroup\$
