
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: imghdr doesn't recognize variant jpeg formats
Type: enhancement
Stage: patch review
Components: Library (Lib)
Versions: Python 3.5

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Claudiu.Popa, ezio.melotti, intgr, jcea, joril, kovid, mvignali, r.david.murray, vstinner
Priority: normal
Keywords: patch

Created on 2012-11-20 10:21 by joril, last changed 2022-04-11 14:57 by admin.

Files
File name              Uploaded                 Description
peanuts15.jpg          joril, 2012-11-20 10:21  JPEG including an ICC profile
imghdr_icc_jpeg.patch  joril, 2012-11-20 17:17  patch against hg head

Pull Requests
URL       Status  Linked
PR 8322   open    gov_vj, 2018-07-18 12:18
PR 14862  open    pchopin, 2019-07-19 14:24

Repositories containing patches
https://bitbucket.org/intgr/cpython
Messages (13)
msg175984 - Author: Joril (joril) Date: 2012-11-20 10:21
imghdr doesn't support jpegs that include an ICC Profile.
This is because imghdr looks for "JFIF" somewhere at the beginning of the file, but the ICC_PROFILE shifts that further.
(The ICC spec is here http://www.color.org/specification/ICC1v43_2010-12.pdf, annex B)
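A minimal sketch of the failure described above, assuming the check in question compares the bytes at offset 6 against "JFIF"/"Exif" (the same expression that appears in the tests quoted later in this thread). The function name `legacy_test_jpeg` is hypothetical, and the header bytes are synthetic and hand-built rather than taken from peanuts15.jpg.

```python
def legacy_test_jpeg(h):
    """Assumed fixed-offset check: identifier expected at bytes 6..9."""
    if h[6:10] in (b'JFIF', b'Exif'):
        return 'jpeg'
    return None

# SOI marker, APP0 marker, segment length, then the "JFIF\0" identifier.
jfif_header = b'\xff\xd8' + b'\xff\xe0' + b'\x00\x10' + b'JFIF\x00'

# SOI marker, APP2 marker, segment length, then the "ICC_PROFILE\0" identifier.
# Bytes 6..9 are now b'ICC_', so the fixed-offset check no longer matches.
icc_header = b'\xff\xd8' + b'\xff\xe2' + b'\x00\x20' + b'ICC_PROFILE\x00'

print(legacy_test_jpeg(jfif_header))  # jpeg
print(legacy_test_jpeg(icc_header))   # None
```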
msg175985 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-11-20 11:15
Can you provide a patch?
msg175986 - Author: Joril (joril) Date: 2012-11-20 11:16
I can try, yes. I'll add one ASAP
msg176009 - Author: Joril (joril) Date: 2012-11-20 17:17
Here it is... It is against the latest hg version, should I write one for 2.7 too?
msg176010 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-11-20 17:31
Thanks for the patch.
> should I write one for 2.7 too?
Not necessary, 2.7 only gets bug fixes.
OTOH it would be nice to have some tests for this new feature (and for the module in general), but there doesn't seem to be any Lib/test/test_imghdr.py file. The module itself seems to contain some kind of tests at the end though.
msg176045 - Author: Joril (joril) Date: 2012-11-21 08:19
It looks like the test just walks a directory recursively and tries to identify its files; there's no "classic" test of the "this is a JPEG, is it detected correctly?" type.
msg184742 - Author: Kovid Goyal (kovid) Date: 2013-03-20 06:23
The attached patch is insufficient, for example, it fails on http://nationalpostnews.files.wordpress.com/2013/03/budget.jpeg?w=300&h=1571
Note that the Linux file utility identifies a file as "JPEG Image data" if the first two bytes of the file are \xff\xd8.
A slightly stricter test that catches more jpeg files:
def test_jpeg(h, f):
    if (h[6:10] in (b'JFIF', b'Exif')) or (h[:2] == b'\xff\xd8' and b'JFIF' in h[:32]):
        return 'jpeg'
msg198034 - Author: (intgr) * Date: 2013-09-18 20:30
I vote we forget about JFIF/Exif headers and only use \xff\xd8 to identify the file. Those headers are optional, and there are tons of files out in the wild without them, for example: https://coverartarchive.org/release/5044b557-a9ed-4a74-b763-e20580ced85d/3354872309.jpg
Proposed patch at https://bitbucket.org/intgr/cpython/commits/012cde305316e22a999d674a0a009200d3e76fdb
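The idea in this message can be sketched roughly as follows. This is an illustration only, not a reproduction of the linked commit; the name `detect_jpeg_by_soi` and the sample bytes are hypothetical.

```python
def detect_jpeg_by_soi(h, f=None):
    """Identify a JPEG purely by its two-byte SOI (start of image) marker.

    h is the leading bytes of the file; f is unused here and only kept
    to mirror the (h, f) signature used by the tests in this thread.
    """
    if h[:2] == b'\xff\xd8':
        return 'jpeg'
    return None

# Synthetic header with no JFIF/Exif segment: SOI followed by a DQT marker.
headerless = b'\xff\xd8' + b'\xff\xdb' + b'\x00\x43\x00'

print(detect_jpeg_by_soi(headerless))            # jpeg
print(detect_jpeg_by_soi(b'\x89PNG\r\n\x1a\n'))  # None
```

The trade-off is that a two-byte signature is permissive: any data that happens to start with \xff\xd8 is reported as a JPEG.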
msg220345 - Author: PCManticore (Claudiu.Popa) * (Python triager) Date: 2014-06-12 12:54
Using \xff\xd8 sounds good to me.
msg220346 - Author: Kovid Goyal (kovid) Date: 2014-06-12 13:09
FYI, the test I currently use in calibre, which has not failed so far for millions of users:
def test_jpeg(h, f):
    if (h[6:10] in (b'JFIF', b'Exif')) or (h[:2] == b'\xff\xd8' and (b'JFIF' in h[:32] or b'8BIM' in h[:32])):
        return 'jpeg'
msg220409 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-06-13 01:11
Issue 21230 reports a parallel problem with recognizing Photoshop images.
We need a patch with tests covering the variant types we know about. I don't have a strong opinion on the simple two byte test versus the more complex test in msg220346, but following 'file' makes sense to me.
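A rough sketch of what such tests might look like, assuming the simple two-byte check is the one adopted. Everything here is hypothetical: `detect_jpeg` stands in for whatever function a patch would provide, and the headers are synthetic byte strings built by hand, not samples from real files.

```python
import unittest


def detect_jpeg(h):
    """Hypothetical two-byte check: JPEG streams open with the SOI marker."""
    return 'jpeg' if h[:2] == b'\xff\xd8' else None


# Synthetic headers: SOI, a segment marker, a length, then an identifier.
VARIANTS = {
    'JFIF': b'\xff\xd8' + b'\xff\xe0' + b'\x00\x10' + b'JFIF\x00',
    'Exif': b'\xff\xd8' + b'\xff\xe1' + b'\x00\x20' + b'Exif\x00\x00',
    'ICC profile first': b'\xff\xd8' + b'\xff\xe2' + b'\x00\x20' + b'ICC_PROFILE\x00',
    'Photoshop': b'\xff\xd8' + b'\xff\xed' + b'\x00\x20' + b'Photoshop 3.0\x00' + b'8BIM',
    'no application segment': b'\xff\xd8' + b'\xff\xdb' + b'\x00\x43\x00',
}

NOT_JPEG = {
    'PNG': b'\x89PNG\r\n\x1a\n',
    'GIF': b'GIF89a',
    'empty': b'',
}


class JpegVariantTests(unittest.TestCase):
    def test_known_variants_are_detected(self):
        for name, header in VARIANTS.items():
            with self.subTest(variant=name):
                self.assertEqual(detect_jpeg(header), 'jpeg')

    def test_other_data_is_rejected(self):
        for name, header in NOT_JPEG.items():
            with self.subTest(kind=name):
                self.assertIsNone(detect_jpeg(header))
```

Saved as a module, this can be run with `python -m unittest`. Swapping in the more complex test from msg220346 would only require replacing `detect_jpeg`; the expected results for some variants may then differ.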
msg221185 - Author: Martin Vignali (mvignali) Date: 2014-06-21 18:44
I'm okay with just testing the first two bytes; it's the method we currently use for our internal tools.
But it might be worthwhile to add another test to detect incomplete files (created, for example, when a camera makes a recording error). These are very useful to detect, because an incomplete JPEG file crashes a lot of applications.
We use this patch of imghdr:
--------------------------------------
def test_jpeg(h, f):
    """JPEG data in JFIF or Exif format"""
    if not h.startswith(b'\xff\xd8'):  # Test empty files, and incorrect start of file
        return None
    else:
        if f:  # if we test a file, test end of jpeg
            f.seek(-2, 2)
            if f.read(2).endswith(b'\xff\xd9'):
                return 'jpeg'
        else:  # if we just test the header, consider this is a valid jpeg and not test end of file
            return 'jpeg'
-------------------------------------
msg221214 - Author: Kovid Goyal (kovid) Date: 2014-06-22 03:39
You cannot assume the file-like object passed to imghdr is seekable. And IMO it is not the job of imghdr to check file validity, especially since it does not do that for all formats.
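To make the seekability point concrete, here is a small sketch. `NonSeekableStream` is a hypothetical stand-in for a pipe or socket, and `ends_with_eoi` is an illustrative helper, not part of imghdr. It shows one way an end-of-file check could be guarded, without taking a position on whether imghdr should validate files at all.

```python
import io


class NonSeekableStream(io.RawIOBase):
    """Hypothetical stand-in for a pipe or socket: readable, not seekable."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        return self._buf.readinto(b)


def ends_with_eoi(f):
    """Return True/False for seekable streams, None when the check is impossible."""
    if not f.seekable():
        return None
    pos = f.tell()
    try:
        f.seek(-2, io.SEEK_END)
        return f.read(2) == b'\xff\xd9'
    finally:
        f.seek(pos)  # leave the stream where the caller had it


complete = io.BytesIO(b'\xff\xd8' + b'image data' + b'\xff\xd9')
truncated = io.BytesIO(b'\xff\xd8' + b'image da')
pipe_like = NonSeekableStream(b'\xff\xd8' + b'image data' + b'\xff\xd9')

print(ends_with_eoi(complete))   # True
print(ends_with_eoi(truncated))  # False
print(ends_with_eoi(pipe_like))  # None
```

Calling `pipe_like.seek(-2, 2)` directly, as the patch in msg221185 would, raises `io.UnsupportedOperation` instead of returning a result.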
History
Date                 User            Action  Args
2022-04-11 14:57:38  admin           set     github: 60716
2019-07-19 14:24:34  pchopin         set     pull_requests: + pull_request14651
2018-07-18 12:18:26  gov_vj          set     stage: test needed -> patch review
                                             pull_requests: + pull_request7860
2014-09-26 08:12:37  Claudiu.Popa    set     stage: patch review -> test needed
2014-06-22 03:39:43  kovid           set     messages: + msg221214
2014-06-21 18:44:22  mvignali        set     nosy: + mvignali
                                             messages: + msg221185
2014-06-13 01:11:51  r.david.murray  set     nosy: + r.david.murray
                                             messages: + msg220409
                                             title: imghdr doesn't support jpegs with an ICC profile -> imghdr doesn't recognize variant jpeg formats
2014-06-13 01:02:26  r.david.murray  link    issue21230 superseder
2014-06-12 13:09:27  kovid           set     messages: + msg220346
2014-06-12 13:04:30  vstinner        set     nosy: + vstinner
2014-06-12 12:54:30  Claudiu.Popa    set     messages: + msg220345
2014-06-12 12:44:05  Claudiu.Popa    set     nosy: + Claudiu.Popa
2014-06-12 12:43:54  Claudiu.Popa    set     versions: + Python 3.5, - Python 3.4
2013-09-18 20:30:26  intgr           set     nosy: + intgr
                                             messages: + msg198034
                                             hgrepos: + hgrepo210
2013-03-20 06:23:29  kovid           set     nosy: + kovid
                                             messages: + msg184742
2012-11-24 01:10:26  jcea            set     nosy: + jcea
2012-11-21 08:19:12  joril           set     messages: + msg176045
2012-11-20 17:31:24  ezio.melotti    set     stage: needs patch -> patch review
                                             messages: + msg176010
                                             components: + Library (Lib)
                                             versions: + Python 3.4, - Python 2.7
2012-11-20 17:17:39  joril           set     files: + imghdr_icc_jpeg.patch
                                             keywords: + patch
                                             messages: + msg176009
2012-11-20 11:16:16  joril           set     messages: + msg175986
                                             versions: + Python 2.7, - Python 3.4
2012-11-20 11:15:14  ezio.melotti    set     versions: + Python 3.4, - Python 2.7
                                             nosy: + ezio.melotti
                                             messages: + msg175985
                                             stage: needs patch
2012-11-20 10:21:20  joril           create
