On Sun, Jun 3, 2012 at 3:06 PM, Calvin Spealman <span dir="ltr">&lt;<a href="mailto:ironfroggy@gmail.com" target="_blank">ironfroggy@gmail.com</a>&gt;</span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im HOEnZb">On Sun, Jun 3, 2012 at 7:49 AM, Maciej Fijalkowski &lt;<a href="mailto:fijall@gmail.com">fijall@gmail.com</a>&gt; wrote:<br>
</div><div class="HOEnZb"><div class="h5">&gt; Hi<br>
&gt;<br>
&gt; I was reading a bit about the regex module and I would like to present some<br>
&gt; other solution into speeding up the re module for Python.<br>
&gt;<br>
&gt; So, as a bit of background - pypy has a re compatible module. It&#39;s also<br>
&gt; JITted and it&#39;s also exportable as a C library (that is a library you can<br>
&gt; call from C with C API, not a python extension module). I wonder if it would<br>
&gt; be worth to put some work into it to make it a library that CPython can use.<br>
&gt;<br>
&gt; On the minus side, the JIT only works on x86 and x86_64, on the plus side,<br>
&gt; since it&#39;s 100% API compatible, it can be used as a _xxx speedup module<br>
&gt; relatively easy.<br>
&gt;<br>
&gt; Do people have opinions?<br>
<br>
</div></div><div class="HOEnZb"><div class="h5">A few questions and comments about such an idea, from someone who<br>
hasn&#39;t used PyPy yet and doesn&#39;t understand the setup involved.<br>
<br>
1) Would PyPy be required to build this as a C-compatible library,<br>
such that CPython could use it as an extension module? That is, would<br>
it make PyPy a required part of building CPython?<br></div></div></blockquote><div><br></div><div>It depends a bit on how we organize things. PyPy (meaning a checkout of the pypy repository, not the pypy interpreter) would be required to generate the necessary C files, and therefore also for maintenance, since the C files are not hand-editable. PyPy would not be required to compile those C files.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">
<br>
2) Are there benchmarks comparing the performance of this<br>
implementation to the existing re module and the proposed regex<br>
module?<br></div></div></blockquote><div><br></div><div>I don&#39;t think so. It really is reasonably fast in a lot of cases and it can definitely be made faster in more cases. The main power comes from JITting: a piece of assembler is compiled for each regex created. I doubt a C library can come close to this, approach-wise. Of course results will vary from case to case, but generally speaking the approach is superior. It would be cool if someone ran the benchmarks to see how things look *right now*.</div>
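As a starting point for such a comparison, here is a minimal sketch of a micro-benchmark using only the standard library. The patterns, the input text, and the names `bench_search`, `CASES`, `TEXT` and `run_all` are made up for illustration and are not taken from any existing benchmark suite. The `engine` argument is assumed to be any module exposing an `re`-compatible `compile()`; only the stdlib `re` is exercised here.

```python
import re
import timeit


def bench_search(engine, pattern, text, number=200, repeat=3):
    """Return the best observed time per search() call, in seconds.

    `engine` is any module with an re-compatible compile() function.
    """
    compiled = engine.compile(pattern)  # compile once, outside the timed loop
    timings = timeit.repeat(
        lambda: compiled.search(text), number=number, repeat=repeat
    )
    return min(timings) / number


# Hypothetical workload, chosen only for illustration.
CASES = {
    "literal": r"needle",
    "char class": r"[a-z]+\d{3}",
    "alternation": r"(?:foo|bar|baz)+qux",
}
TEXT = "lorem ipsum " * 500 + "foobarbazqux abc123 needle"


def run_all(engine=re, number=200, repeat=3):
    """Benchmark every case against one engine; returns {case name: seconds}."""
    return {
        name: bench_search(engine, pattern, TEXT, number, repeat)
        for name, pattern in CASES.items()
    }


if __name__ == "__main__":
    for name, seconds in run_all().items():
        print(f"{name:12s} {seconds * 1e6:10.2f} us per search")
```

Passing a different `re`-compatible module as `engine` would give numbers for the same cases on another implementation. A handful of patterns like this says little about real workloads, so any results should be read as a rough indication only.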

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">
<br>
3) How would the maintenance work? Where would the module live<br>
&quot;officially&quot;? Does CPython fork it or is it extracted from PyPy in a<br>
way it can be installed as an external dependency? How does CPython<br>
get changes upstream?<br></div></div></blockquote><div><br></div><div>I would honestly hope it can be maintained as a part of pypy and then cpython would just use it. But those are just hopes.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="HOEnZb"><div class="h5">
<br>
4) I may be remembering wrong, but I recall maintenance ease to be one<br>
of the justifications for the regex module. How would your proposal<br>
compare? Is a random developer looking to fix a bug in his way going<br>
to find this easier or more difficult to get his head around?<br></div></div></blockquote><div><br></div><div>I think it&#39;s relatively easy since it&#39;s Python code after all, but what do I know. Someone has to have a look; it lives here: <a href="https://bitbucket.org/pypy/pypy/src/default/pypy/rlib/rsre">https://bitbucket.org/pypy/pypy/src/default/pypy/rlib/rsre</a>. I would like people to form their own opinions on whether it&#39;s more or less maintenance effort. On our side, we&#39;ll maintain this particular part of the code anyway (so it&#39;s also easier because you can leave it to others).</div>

<div> </div><div>Cheers,</div><div>fijal</div></div>
