homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Add a key parameter (like sorted) to heapq.merge
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Tommy.Carstensen, berker.peksag, eric.araujo, eric.snow, giampaolo.rodola, mark.dickinson, python-dev, rhettinger, serhiy.storchaka, ssapin, stutzbach, terry.reedy
Priority: normal Keywords: needs review, patch

Created on 2012年01月09日 10:35 by ssapin, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
heapq_merge_key.patch ssapin, 2012年01月09日 10:35 'hg diff' output against rev ca2a35140e6a review
benchmark_heapq_merge.py ssapin, 2012年01月09日 10:43 benchmark several implementations
heapq_merge_key_duplicate.patch ssapin, 2012年01月16日 14:58 'hg diff' output against rev ca2a35140e6a review
heap.diff rhettinger, 2013年03月26日 06:54 Rough draft (untested) for a Heap() class review
heap2.diff rhettinger, 2013年04月29日 11:43 Update the draft Heap() class
keymerge.diff rhettinger, 2014年05月27日 01:31 Draft patch without doc updates review
keymerge2.diff rhettinger, 2014年05月29日 08:42 Add both key=func and reverse=True options review
Messages (21)
msg150927 - (view) Author: Simon Sapin (ssapin) Date: 2012年01月09日 10:35
Hi,
The attached patch adds a 'key' optional parameter to the heapq.merge function that behaves as in sorted().
Related discussion: http://mail.python.org/pipermail/python-ideas/2012-January/013295.html
This is my first contribution to CPython.
msg150928 - (view) Author: Simon Sapin (ssapin) Date: 2012年01月09日 10:43
The attached script benchmarks the basline (current implementation) against 3 new implementations, as suggested on http://mail.python.org/pipermail/python-ideas/2012-January/013296.html
On my machine, the output is:
 merge_baseline
 per run, min of 3 = 7.527 ms
 
 merge_1
 per run, min of 3 = 9.894 ms
 131.449 % of baseline
 
 merge_2
 per run, min of 3 = 7.948 ms
 105.594 % of baseline
 
 merge_3
 per run, min of 3 = 7.581 ms
 100.716 % of baseline
On this particular input, merge_2 adds 6% of overhead when the key parameter is not used. While merge_3 only adds 1% of overhead, it almost doubles the amount of code. (Which was admittedly not that long to begin with.)
The patch in the previous message is with the merge_2 implementation, which seemed like the best compromise to me.
msg150931 - (view) Author: Simon Sapin (ssapin) Date: 2012年01月09日 11:10
Oops, the patch to the documentation would also need 'New in 3.3: the key parameter', with the right Sphinx directive. But that depends on whether this change ends up in 3.3 or 3.4.
Does 3.3 still get new features?
msg150954 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012年01月09日 16:53
Yes, 3.3 is still in the early development stage, and new features will be accepted until the first beta (in June, see PEP 398). ".. versionadded:: 3.3 The *key* parameter" will do.
msg150969 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012年01月09日 19:33
Simon, please keep the original version fast by creating two code paths:
 if key is None:
 original_code
 else:
 new_code using the key_function
msg150983 - (view) Author: Simon Sapin (ssapin) Date: 2012年01月09日 22:13
Raymond, please have a look at merge_3 in benchmark_heapq_merge.py. It is implemented as you say.
Do you think the speed is worth the code duplication?
msg151369 - (view) Author: Simon Sapin (ssapin) Date: 2012年01月16日 14:58
heapq_merge_key_duplicate.patch is a new patch with two code path. It also updates the function’s docstring (which the previous patch did not). Raymond, do you think the speed is worth the DRY violation?
msg152802 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012年02月07日 04:11
I'll look at this in the next couple of weeks. Hang tight :-)
msg152984 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012年02月09日 20:39
FWIW, Guido approves of the idea, msg152969 in #4356 
msg168070 - (view) Author: Simon Sapin (ssapin) Date: 2012年08月13日 07:51
I just remembered about this. I suppose it is too late for 3.3?
msg168116 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012年08月13日 15:20
Yes, 3.3 is already in beta.
msg185259 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013年03月26日 06:54
Attaching a rough draft implementation for a fully encapsulated Heap() class that is thread-safe, supports minheaps and maxheaps, and efficiently implements key-functions (called no more than once per key).
msg188066 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013年04月29日 12:14
heap2.diff contains only a single line's change. Wrong file attached?
msg188067 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013年04月29日 12:14
Ah, I see the new file now (I'd failed to refresh my browser); sorry for the noise.
msg188080 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013年04月29日 17:51
Looks pretty good to me.
- There's a bonus print call in the diff.
- Should the "len(self._data)" call be protected by the lock? I can't immediately think of any reason why that would be necessary (e.g., pushpop nd poppush never change the size of self._data, so there's no risk of getting a bogus length there), but the lack of the lock makes me nervous.
- Support for iter() seems a bit out of place to me. What are the use-cases for this? Would it make sense to leave this out (for now)?
msg188085 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013年04月29日 18:55
There is already one heap class in the stdlib: queue.PriorityQueue. Why create a duplicate instead extend queue.PriorityQueue with desired features?
May be name the maxheap parameter as reverse?
msg219380 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014年05月30日 09:28
New changeset f5521f5dec4a by Raymond Hettinger in branch 'default':
Issue #13742: Add key and reverse parameters to heapq.merge()
http://hg.python.org/cpython/rev/f5521f5dec4a 
msg233792 - (view) Author: Tommy Carstensen (Tommy.Carstensen) Date: 2015年01月10日 01:19
I noticed 3.5 alpha1 is not released until February 1st. Is there any way I can get my hands on this new functionality?
msg233793 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2015年01月10日 01:24
Hi Tommy, the patch is already committed to Python 3.5. See https://docs.python.org/3.5/library/heapq.html#heapq.merge 
msg233795 - (view) Author: Tommy Carstensen (Tommy.Carstensen) Date: 2015年01月10日 01:55
Yes, but 3.5 has not been pre-released yet.
msg233804 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015年01月10日 08:47
You can set up mecurial on your machine, make a read-only clone of the cpython repository, and compile it just as do other people, whether core-developers or otherwise. See docs.python.org/devguide for details.
History
Date User Action Args
2022年04月11日 14:57:25adminsetgithub: 57951
2015年01月10日 08:47:47terry.reedysetmessages: + msg233804
2015年01月10日 01:55:18Tommy.Carstensensetmessages: + msg233795
2015年01月10日 01:24:07berker.peksagsetnosy: + berker.peksag

messages: + msg233793
stage: patch review -> resolved
2015年01月10日 01:19:39Tommy.Carstensensetnosy: + Tommy.Carstensen
messages: + msg233792
2014年05月30日 09:29:28rhettingersetstatus: open -> closed
resolution: fixed
2014年05月30日 09:28:45python-devsetnosy: + python-dev
messages: + msg219380
2014年05月29日 08:42:18rhettingersetfiles: + keymerge2.diff
2014年05月29日 08:42:00rhettingersetfiles: - keymerge2.diff
2014年05月29日 08:32:46rhettingersetfiles: + keymerge2.diff
2014年05月27日 01:31:45rhettingersetfiles: + keymerge.diff
2014年05月27日 01:30:59rhettingersetversions: + Python 3.5, - Python 3.4
2014年05月12日 17:57:43eric.snowsetnosy: + eric.snow
2013年04月29日 18:55:40serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg188085
2013年04月29日 17:51:35mark.dickinsonsetmessages: + msg188080
2013年04月29日 12:14:41mark.dickinsonsetmessages: + msg188067
2013年04月29日 12:14:03mark.dickinsonsetnosy: + mark.dickinson
messages: + msg188066
2013年04月29日 11:43:52rhettingersetfiles: - heap2.diff
2013年04月29日 11:43:35rhettingersetfiles: + heap2.diff
2013年04月29日 10:54:52rhettingersetpriority: low -> normal
files: + heap2.diff
2013年03月26日 06:54:31rhettingersetfiles: + heap.diff

messages: + msg185259
2012年08月13日 15:20:10eric.araujosetkeywords: + needs review

stage: patch review
messages: + msg168116
versions: + Python 3.4, - Python 3.3
2012年08月13日 07:51:59ssapinsetmessages: + msg168070
2012年02月09日 20:39:38terry.reedysetnosy: + terry.reedy
messages: + msg152984
2012年02月07日 08:44:03giampaolo.rodolasetnosy: + giampaolo.rodola
2012年02月07日 04:11:24rhettingersetmessages: + msg152802
2012年01月16日 14:58:54ssapinsetfiles: + heapq_merge_key_duplicate.patch

messages: + msg151369
2012年01月09日 22:13:45ssapinsetmessages: + msg150983
2012年01月09日 19:33:36rhettingersetpriority: normal -> low

messages: + msg150969
2012年01月09日 18:44:52stutzbachsetnosy: + stutzbach
2012年01月09日 16:53:30eric.araujosetnosy: + eric.araujo

messages: + msg150954
versions: - Python 3.4
2012年01月09日 11:10:14ssapinsetmessages: + msg150931
2012年01月09日 10:51:00eric.smithsetassignee: rhettinger
2012年01月09日 10:43:36ssapinsetfiles: + benchmark_heapq_merge.py

messages: + msg150928
2012年01月09日 10:35:41ssapincreate

AltStyle によって変換されたページ (->オリジナル) /