I am trying to convert an email from a chunk of an mbox file into a readable format, but I get mojibake (da?s instead of das) when I open the result with neovim (with set encoding=utf-8 and set fileencodings=utf-8 in my .vimrc).

Here is the email file emailfile.txt:

From 9999999999999999@xxx Tue Mar 09 17:00:00 +0500 2019 
X-GM-THRID: 99999999999999999
X-mail-Labels: Archived,Sent,Opened
MIME-Version: 1.0
Date: 2019年3月09日 17:00:00 +0500
Message-ID: <[email protected]>
Subject: THETITLE
From: My Name <[email protected]>
To: [email protected]
Content-Type: multipart/alternative; boundary="0000000000009999999999999"
--0000000000009999999999999
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ da=
s ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
--0000000000009999999999999
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
<div>ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ=
ZZ das ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ=
>
--0000000000009999999999999--

I unpack it with the command

$ munpack -t emailfile.txt
part1 (text/plain)
part2 (text/html)
$ cat part1
ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ das ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

However, here is what I get when I open it with vim (notice the superfluous ?):


ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ da?s ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

Might it come from the a= at the end of the line in the plain-text part of the email?

How can I open the file with vim so that it displays das without the ??

EDIT

As requested in the comments:

$ locale
LANG=""
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

Here is the output of vim -Nu NONE part1:

ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ daÿs ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
^M

Now the ? character has changed into a ÿ.
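To see which byte actually sits between da and s, the raw bytes of part1 can be inspected outside the editor. The following is a minimal sketch; the helper name non_ascii_bytes is hypothetical, and the sample bytes assume the stray character is a single 0xFF byte (which is what ÿ corresponds to in Latin-1), something the outputs above suggest but do not prove.

```python
def non_ascii_bytes(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, hex value) for every byte outside 7-bit ASCII."""
    return [(i, f"0x{b:02x}") for i, b in enumerate(data) if b > 0x7F]


# Hypothetical stand-in for the contents of part1 (assumption: one 0xFF byte).
sample = b"ZZ da\xffs ZZ\r\n"
print(non_ascii_bytes(sample))
```

On the real file, the same helper could be called as non_ascii_bytes(open("part1", "rb").read()); an empty list would mean the file is pure ASCII and the problem lies in the editor settings instead.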

asked Apr 5, 2024 at 16:23
  • $ vim -Nu NONE part1 opens the file correctly, here. Could you add the output of $ locale to the body of your question? Also, utf-8 is a bad value for :help 'fileencodings'. Commented Apr 6, 2024 at 12:06
  • Many thanks for your comment @romaini. Unfortunately, the command $ vim -Nu NONE part1 did not solve the problem. I edited the question accordingly. Commented Apr 6, 2024 at 22:05

1 Answer

I finally found the answer in another SO question (Encoding issue : decode Quoted-Printable string in Python).

Here is the Python script:

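import quopri

# Read raw bytes: text mode hands quopri a str, which fails on non-ASCII input.
with open("emailfile.txt", "rb") as f:
    decoded = quopri.decodestring(f.read().rstrip())

# The parts declare charset="UTF-8"; replace undecodable bytes instead of failing.
print(decoded.decode("utf-8", errors="replace"))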
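As an alternative sketch, not taken from the linked question: Python's standard email package can parse the whole message and decode only the text/plain part, using the transfer encoding and charset that the part declares. The soft line break (the = at the end of da=) is then joined back into das. The sample message below is a shortened, hypothetical stand-in for emailfile.txt, and plain_text_part is an illustrative helper name.

```python
import email
from email import policy


def plain_text_part(raw: bytes) -> str:
    """Return the decoded text/plain body of a raw email message."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    part = msg.get_body(preferencelist=("plain",))
    if part is None:
        raise ValueError("no text/plain part found")
    # get_content() undoes quoted-printable and applies the declared charset.
    return part.get_content()


# Shortened, hypothetical stand-in for emailfile.txt.
sample = (
    b"MIME-Version: 1.0\n"
    b"Subject: THETITLE\n"
    b'Content-Type: multipart/alternative; boundary="BOUNDARY"\n'
    b"\n"
    b"--BOUNDARY\n"
    b'Content-Type: text/plain; charset="UTF-8"\n'
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"ZZZZ da=\n"
    b"s ZZZZ\n"
    b"--BOUNDARY\n"
    b'Content-Type: text/html; charset="UTF-8"\n'
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"<div>ZZZZ das ZZZZ</div>\n"
    b"--BOUNDARY--\n"
)

print(plain_text_part(sample))
```

For the real file, the bytes would come from open("emailfile.txt", "rb").read(). Working on the whole message this way avoids printing the headers and the HTML part along with the text.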
answered May 31, 2024 at 17:10