
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Add preferred extensions for MIME types
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.8
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: The Compiler, Tom.Christie, ajaksu2, cvrebert, david.lindquist, elesbom, eric.araujo, evanj, ezio.melotti, iritkatriel, jlgijsbers, kxroberto, lambacck, leos, martin.panter, ptarjan, sandro.tosi, sascha_silbe, wichert
Priority: normal
Keywords: patch

Created on 2004-10-08 15:44 by kxroberto, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name  Uploaded
issue1043134.patch  elesbom, 2010-11-21 17:09
mimetypes.patch  david.lindquist, 2014-03-01 18:22
Messages (21)
msg54278 - Author: kxroberto (kxroberto) Date: 2004-10-08 15:44
Instead of returning the first entry in the list of
extensions, it should return the most reasonable one. Here:
to have a *.txt on disk after saving?
msg54279 - Author: Johannes Gijsbers (jlgijsbers) * (Python triager) Date: 2004-10-09 15:26
How would you suggest finding out what the most reasonable
extension for a mime type is?
msg54280 - Author: kxroberto (kxroberto) Date: 2004-10-10 08:44
In mimetypes.py there is already a
common_types = {
 '.jpg' : 'image/jpg',
...
.txt could be added there;
maybe guess_extension should first do a reverse lookup in
that dict instead of picking at random ...?
Background: my intent was to save a MIME attachment as a
(startable) temporary file, yet I got wonderful .ksh's for
text files and had to fumble ...
 
msg54281 - Author: Johannes Gijsbers (jlgijsbers) * (Python triager) Date: 2004-10-11 20:15
common_types is for adding some non-standard types, not for
determining which extension is most reasonable. I'll be
happy to look at a decent patch, but I'm moving this to
feature request until then.
msg54282 - Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2004-12-19 00:44
While I agree with the original poster that returning '.txt'
is preferable to the others in the list returned by
mimetypes.guess_all_extensions() at least 9 times out of 10,
being able to prioritize all of the types is not necessarily
the easiest thing to do for all of the possible returned lists.
Is using a custom comparison function along with the list
returned by guess_all_extensions() sufficient?
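The custom-comparison idea in msg54282 could look roughly like the sketch below. It is only an illustration: PREFERRED, rank and pick_extension are hypothetical names that do not exist in mimetypes, and the preference order is an assumption chosen for the example.

```python
import mimetypes

# Hypothetical preference order; not part of the mimetypes module.
PREFERRED = [".txt", ".html", ".jpg"]


def rank(ext):
    """Sort key: preferred extensions first (in list order), the rest alphabetically."""
    if ext in PREFERRED:
        return (0, PREFERRED.index(ext), ext)
    return (1, 0, ext)


def pick_extension(mime_type, candidates=None):
    """Return the best-ranked extension for mime_type, or None if there is none."""
    if candidates is None:
        # guess_all_extensions() returns every known extension for the type.
        candidates = mimetypes.guess_all_extensions(mime_type)
    return min(candidates, key=rank) if candidates else None
```

The ordering problem does not go away with this approach; it only moves into PREFERRED, which someone still has to curate.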
msg82101 - Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2009-02-14 18:21
Confirmed on trunk.
msg114379 - Author: Mark Lawrence (BreamoreBoy) * Date: 2010-08-19 16:51
I'll close this in a couple of weeks unless someone wants it kept open.
msg114461 - Author: Stefan Krah (skrah) * (Python committer) Date: 2010-08-20 21:50
I think you are closing too aggressively.
Python 3.2a0 (py3k:81783, Jun 6 2010, 16:07:26) 
[GCC 4.1.3 20080623 (prerelease) (Ubuntu 4.1.2-23ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mimetypes
>>> mimetypes.guess_extension('text/plain')
'.ksh'
>>>
msg121798 - Author: Chris Lambacher (lambacck) * Date: 2010-11-20 22:25
While I agree that getting .ksh is an unfortunate guess, I am not sure how you can guess in the face of many options (especially when those options are parsed out of a mimetypes file or the Windows registry).
Perhaps there should be a "reasonable_defaults" map that is checked first for very basic types where there are multiple extensions for a type?
msg121867 - Author: Paul Tarjan (ptarjan) Date: 2010-11-21 05:41
6 years old and still not fixed?
http://www.stdicon.com/mimetype/text/plain
Please return txt
msg121953 - Author: Rafael dos Santos Gonçalves (elesbom) Date: 2010-11-21 17:09
ksh is text/plain too; all of these extensions are text/plain:
'.ksh', '.pl', '.bat', '.h', '.c', '.txt', '.asc', '.text', '.pot', '.brf'.
The problem is that the code returns the first item of the list:
return extensions[0]
So I added a boolean parameter called all_exts to the guess_extension method. When it is True, the method returns a tuple with all possible extensions.
I hope this helps.
msg121966 - Author: Chris Lambacher (lambacck) * Date: 2010-11-21 19:18
Rafael,
There is already a method which returns all the extensions. What is required is a flag (or separate dict) which provides a canonical extension. The question is whether it is sufficient to rely on the default provided mimetypes for the default in the face of mimetypes read out of the mimetypes files or Windows registry.
I don't see a way to fix the bug without also providing an API to "pick the winner" for those cases that are not provided in the default list.
msg140264 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-13 15:06
The proposed patch does not solve the issue. In the current API, there is no way to do it, so this bug requires a new feature. I think it would involve a new dict, like preferred_extensions, which would be seeded with default values, like .jpg for image/jpeg and .txt for text/plain, and a few functions/methods to query the dict or add items.
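A minimal sketch of the shape described in msg140264, written as a wrapper rather than as a change to the module. The dict name follows the message; add_preferred_extension and guess_preferred_extension are hypothetical helpers, and none of these names exist in mimetypes.

```python
import mimetypes

# Hypothetical dict, seeded with the two defaults mentioned in the message.
preferred_extensions = {
    "image/jpeg": ".jpg",
    "text/plain": ".txt",
}


def add_preferred_extension(mime_type, ext):
    """Register ext as the canonical extension for mime_type."""
    preferred_extensions[mime_type.lower()] = ext


def guess_preferred_extension(mime_type, strict=True):
    """Return the canonical extension if one is registered, else the stdlib guess."""
    mime_type = mime_type.lower()
    if mime_type in preferred_extensions:
        return preferred_extensions[mime_type]
    return mimetypes.guess_extension(mime_type, strict=strict)
```

Types without a registered preference still go through mimetypes.guess_extension(), so they keep whatever behavior the running interpreter has.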
msg143665 - Author: Leo Shklovskii (leos) Date: 2011-09-07 08:20
I'm running into a similar issue with this function. My bug is that guess_type('foo.png') returns image/x-png. This occurs on Windows because there are mappings to both image/png and image/x-png in the registry (as there should be, since that key is actually a reverse mapping) and the code simply picks the first key that it enumerates over. This issue strikes in both directions.
Chris and others bring up a valid issue: how to decide what the winning result is?
I think the answer is pretty clear - you use the common_types mapping already in the file and expand it as appropriate. If the mimetype can't be found, only then do you go to the Windows registry. The behavior on Linux is even stranger to me (now we'll dig through an arbitrary list of files that might contain MIME info or may have completely irrelevant data) but it's a pragmatic solution.
If someone needs to customize what guess_type returns, they can simply wrap the guess_type function in their own code or monkey patch if they don't have access to the source they're running. Changing such a mime type is a really advanced and unusual operation. If that's unacceptable, the code can provide a hook for an 'apache MIME config' file on Windows in a standard place (either pythonpath, or %system% or wherever) that it will check before going to common_types or to the registry.
Making this change doesn't require changing the API at all, just the implementation changes.
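The wrapping approach from msg143665 might be sketched as follows. CURATED_TYPES and guess_type_curated are hypothetical names, and the table contents are assumptions for illustration; the idea is only that a curated table is consulted before anything read from the registry or system files.

```python
import mimetypes
import posixpath

# Hypothetical curated table that wins over registry or system-file entries.
CURATED_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".txt": "text/plain",
}


def guess_type_curated(url, strict=True):
    """Return (type, encoding), preferring the curated table over mimetypes."""
    ext = posixpath.splitext(url)[1].lower()
    if ext in CURATED_TYPES:
        # Curated entries are plain files here, so no content encoding is reported.
        return CURATED_TYPES[ext], None
    return mimetypes.guess_type(url, strict=strict)
```

With this wrapper, guess_type_curated('foo.png') returns ('image/png', None) regardless of what the platform's own tables contain.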
msg212518 - Author: David Lindquist (david.lindquist) * Date: 2014-03-01 18:22
I don't think it is unreasonable to return a well-known extension for certain mime types, text/plain being the most obvious (and most in need of repair; .ksh??).
I've attached a patch based on the previous discussion.
msg214951 - Author: Wichert Akkerman (wichert) Date: 2014-03-27 13:06
Here is a related question on SO: http://stackoverflow.com/questions/352837/how-to-add-file-extensions-based-on-file-type-on-linux-unix 
msg215571 - Author: David Lindquist (david.lindquist) * Date: 2014-04-04 22:02
Anyone interested in picking this up, or at least commenting on the approach I suggested in the patch? Seems like an easy fix for a long-standing bug.
msg226466 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2014-09-06 02:39
See also <https://bugs.python.org/issue6626#msg91205>, which mentions using a list of tuples instead of a dictionary, which sounds like it might help with this issue. Doing it that way you might be able to avoid some duplication in the lists.
msg277024 - Author: Tom Christie (Tom.Christie) Date: 2016-09-20 12:19
Confirming that I've also bumped into this for Python 3.5.
A docs update would seem to be the lowest-cost option to start with.
Right now `mimetypes.guess_extension()` isn't terribly useful, and it'd be better to at least know that upfront.
msg384346 - Author: Florian Bruhin (The Compiler) * Date: 2021-01-04 20:21
I think this has been fixed in Python 3.7+ via https://github.com/python/cpython/pull/14375 - at least for a couple of types.
Comparing Python 3.6 with the current state, the following changed (which can be used as an "override" dict before calling mimetypes.guess_extension):
 "application/manifest+json": ".webmanifest", # not None
 "application/octet-stream": ".bin", # not .a
 "application/postscript": ".ps", # not .ai
 "application/vnd.ms-excel": ".xls", # not .xlb
 "application/vnd.ms-powerpoint": ".ppt", # not .pot
 "application/wasm": ".wasm", # not None
 "application/x-hdf5": ".h5", # not None
 "application/xml": ".xsl", # not .rdf
 "audio/mpeg": ".mp3", # not .mp2
 "image/jpeg": ".jpg", # not .jpe
 "image/tiff": ".tiff", # not .tif
 "text/html": ".html", # not .htm
 "text/plain": ".txt", # not .bat
 "video/mpeg": ".mpeg", # not .m1v
msg410859 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2022-01-18 13:02
PR14375 indeed adds a test for this as well (test_preferred_extension).
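For code that must also run on older interpreters, the "override" dict from msg384346 could be applied along these lines. This is a sketch under assumptions: OVERRIDES holds only a few of the listed entries, and guess_extension_compat and stdlib_disagreements are hypothetical helper names.

```python
import mimetypes

# A few entries copied from the table in msg384346; extend as needed.
OVERRIDES = {
    "application/octet-stream": ".bin",
    "audio/mpeg": ".mp3",
    "image/jpeg": ".jpg",
    "text/html": ".html",
    "text/plain": ".txt",
}


def guess_extension_compat(mime_type, strict=True):
    """Consult OVERRIDES first, then fall back to the stdlib guess."""
    ext = OVERRIDES.get(mime_type.lower())
    if ext is not None:
        return ext
    return mimetypes.guess_extension(mime_type, strict=strict)


def stdlib_disagreements():
    """Report overridden types for which this interpreter's stdlib guess differs."""
    report = {}
    for mime_type, ext in OVERRIDES.items():
        guessed = mimetypes.guess_extension(mime_type)
        if guessed != ext:
            report[mime_type] = guessed
    return report
```

An empty result from stdlib_disagreements() suggests the running interpreter already returns the preferred extensions for these types and the override is redundant there.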
History
Date  User  Action  Args
2022-04-11 14:56:07  admin  set  github: 40993
2022-01-18 13:02:58  iritkatriel  set  status: open -> closed; versions: + Python 3.8, - Python 3.5; nosy: + iritkatriel; messages: + msg410859; resolution: fixed; stage: patch review -> resolved
2021-01-04 20:21:06  The Compiler  set  nosy: + The Compiler; messages: + msg384346
2019-05-02 04:17:48  josiahcarlson  set  nosy: - josiahcarlson
2018-07-08 13:08:22  sascha_silbe  set  nosy: + sascha_silbe
2017-03-16 04:25:15  martin.panter  link  issue29823 superseder
2016-09-20 12:19:02  Tom.Christie  set  nosy: + Tom.Christie; messages: + msg277024
2014-09-06 02:39:08  martin.panter  set  nosy: + martin.panter; messages: + msg226466
2014-06-19 22:20:15  ezio.melotti  set  stage: test needed -> patch review; versions: + Python 3.5, - Python 3.4
2014-05-16 05:16:59  cvrebert  set  nosy: + cvrebert
2014-05-13 22:13:11  skrah  set  nosy: - skrah
2014-04-04 22:02:49  david.lindquist  set  messages: + msg215571
2014-03-27 13:06:28  wichert  set  nosy: + wichert; messages: + msg214951
2014-03-01 18:22:42  david.lindquist  set  files: + mimetypes.patch; nosy: + david.lindquist; messages: + msg212518; keywords: + patch
2014-02-03 19:07:15  BreamoreBoy  set  nosy: - BreamoreBoy
2012-10-11 11:52:01  ezio.melotti  set  nosy: + ezio.melotti; versions: + Python 3.4, - Python 2.7, Python 3.3
2012-10-10 15:37:11  evanj  set  nosy: + evanj
2011-09-07 08:20:06  leos  set  nosy: + leos; messages: + msg143665; versions: + Python 2.7
2011-08-20 22:40:11  sandro.tosi  set  nosy: + sandro.tosi
2011-07-13 15:06:30  eric.araujo  set  keywords: - patch, easy; messages: + msg140264; title: mimetypes.guess_extension('text/plain') == '.ksh' ??? -> Add preferred extensions for MIME types
2011-03-09 03:27:09  terry.reedy  set  nosy: jlgijsbers, josiahcarlson, kxroberto, ajaksu2, eric.araujo, lambacck, skrah, ptarjan, BreamoreBoy, elesbom; versions: + Python 3.3, - Python 3.1, Python 2.7, Python 3.2
2010-11-21 19:18:33  lambacck  set  messages: + msg121966
2010-11-21 17:09:17  elesbom  set  files: + issue1043134.patch; nosy: + elesbom; messages: + msg121953; keywords: + patch
2010-11-21 05:41:08  ptarjan  set  nosy: + ptarjan; messages: + msg121867
2010-11-21 01:15:53  eric.araujo  link  issue6799 superseder
2010-11-20 22:30:48  eric.araujo  set  nosy: + eric.araujo
2010-11-20 22:25:44  lambacck  set  nosy: + lambacck; messages: + msg121798
2010-08-20 21:50:29  skrah  set  status: pending -> open; versions: + Python 3.1, Python 3.2; nosy: + skrah; messages: + msg114461
2010-08-19 16:51:17  BreamoreBoy  set  status: open -> pending; nosy: + BreamoreBoy; messages: + msg114379
2009-04-22 16:04:08  ajaksu2  set  keywords: + easy
2009-02-14 18:21:36  ajaksu2  set  nosy: + ajaksu2; stage: test needed; messages: + msg82101; versions: + Python 2.7
2004-10-08 15:44:17  kxroberto  create
