Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit d6f6bfc

Browse files
committed
update
1 parent f9c9f6d commit d6f6bfc

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

46 files changed

+3075
-1070
lines changed

‎README.md

Lines changed: 5 additions & 63 deletions
Original file line numberDiff line numberDiff line change
@@ -1,70 +1,12 @@
1-
# Python 源码剖析》学习笔记
1+
# Python 3.9 源代码阅读笔记
22

3-
> 《Python 源码剖析》
4-
> 作者:陈儒 Robert Chen
5-
> 出版年份:2008 年
6-
> Python 版本:2.5
3+
之前看了 "Python 源码剖析", 这本书是基于 Python 2.5 的, 现在的 Python 已经发生了很大的改变. 因此, 在这里记录下阅读 Python 3.9 的源代码的笔记, 当然阅读内容主要是 Python3 某些新加的特性.
74

8-
在阅读《Python 源码剖析》的过程中记录的一些笔记,不是特别详细,简单记录了一些关键的地方,方便以后查看。
5+
-[Python 虚拟机](ceval.md)
96

10-
## 编译代码
7+
-[GIL](gil.md)
118

12-
使用 Docker 编译 Python 源代码,使用说明参考 [Docker 使用说明](docker.md)
13-
14-
## 源代码
15-
16-
在阅读《Python 源码剖析》过程中,为了验证一些想法,对 Python2.5的源代码进行了不少修改。修改过的代码在[这里](https://github.com/ausaki/python25)
17-
18-
master 分支是原始代码。
19-
20-
每个 chxx 分支对应书中相应的章节,基于 master 分支修改而来。
21-
22-
## 其它资源
23-
24-
- [作者在 CSDN 的博客](https://blog.csdn.net/balabalamerobert)(不再更新)。
25-
26-
- [Extending and Embedding the Python Interpreter](https://docs.python.org/2.7/extending/index.html)
27-
28-
扩展和嵌入 Python 解析器,介绍了如何用 C/C++ 编写 Python 的扩展模块,如何在其它语言中嵌入 Python 解释器。
29-
30-
- [C API](https://docs.python.org/2.7/c-api/index.html)
31-
32-
详细介绍了 Python 内部的 C API。
33-
34-
- [Python Developer’s Guide](https://devguide.python.org/)
35-
36-
Python 开发者指南。
37-
38-
39-
40-
## 目录
41-
42-
- 第一部分
43-
44-
- [ch01 - Pyhton 对象初探](ch01.md)
45-
- [ch02 - Pyhton 中的整数对象](ch02.md)
46-
- [ch03 - Pyhton 中的字符串对象](ch03.md)
47-
- [ch04 - Python 中的 List 对象](ch04.md)
48-
- [ch05 - Python 中的 Dict 对象](ch05.md)
49-
- [ch06 - 最简单的Python模拟——Small Python](ch06.md)
50-
51-
- 第二部分
52-
53-
- [ch07 - Python的编译结果——Code对象与pyc文件](ch07.md)
54-
- [ch08 - Python 虚拟机框架](ch08.md)
55-
- [ch09 - Python虚拟机中的一般表达式](ch09.md)
56-
- [ch010 - Python虚拟机中的控制流](ch10.md)
57-
- [ch011 - Python虚拟机中的函数机制](ch11.md)
58-
- [ch012 - Python虚拟机中的类机制](ch12.md)
59-
- [ch013 - Python运行环境初始化](ch13.md)
60-
- [ch014 - Python模块的动态加载机制](ch14.md)
61-
- [ch015 - Python多线程机制](ch15.md)
62-
- [ch016 - Python的内存管理机制](ch16.md)
63-
64-
65-
## THE END
66-
67-
大概花了一个月的时间(2018年8月14日 ~ 2018年9月13日 )看完本书,收获颇多,初步了解了Python 的底层细节,也增加了阅读源码的信心。《Python 源码剖析》这本书没有办法把 Python 源码的各个方面都介绍到,自己有时间的话还应该多阅读源码。
689

6910

11+
[源码注释分支](https://github.com/ausaki/python)
7012

‎ch15.md

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -32,4 +32,3 @@ Python 中的线程是对操作系统的原生线程的封装,具体实现在
3232

3333
当持有GIL的线程发现标志后,会释放掉 GIL。
3434

35-

‎code 对象.md

Lines changed: 242 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,242 @@
1+
# code 对象
2+
3+
```c
4+
struct PyCodeObject {
5+
PyObject_HEAD
6+
int co_argcount; /* #arguments, except *args */
7+
int co_posonlyargcount; /* #positional only arguments */
8+
int co_kwonlyargcount; /* #keyword only arguments */
9+
int co_nlocals; /* #local variables */
10+
int co_stacksize; /* #entries needed for evaluation stack */
11+
int co_flags; /* CO_..., see below */
12+
int co_firstlineno; /* first source line number */
13+
PyObject *co_code; /* instruction opcodes */
14+
PyObject *co_consts; /* list (constants used) */
15+
PyObject *co_names; /* list of strings (names used) */
16+
PyObject *co_varnames; /* tuple of strings (local variable names) */
17+
PyObject *co_freevars; /* tuple of strings (free variable names) */
18+
PyObject *co_cellvars; /* tuple of strings (cell variable names) */
19+
/* The rest aren't used in either hash or comparisons, except for co_name,
20+
used in both. This is done to preserve the name and line number
21+
for tracebacks and debuggers; otherwise, constant de-duplication
22+
would collapse identical functions/lambdas defined on different lines.
23+
*/
24+
Py_ssize_t *co_cell2arg; /* Maps cell vars which are arguments. */
25+
PyObject *co_filename; /* unicode (where it was loaded from) */
26+
PyObject *co_name; /* unicode (name, for reference) */
27+
PyObject *co_lnotab; /* string (encoding addr<->lineno mapping) See
28+
Objects/lnotab_notes.txt for details. */
29+
void *co_zombieframe; /* for optimization only (see frameobject.c) */
30+
PyObject *co_weakreflist; /* to support weakrefs to code objects */
31+
/* Scratch space for extra data relating to the code object.
32+
Type is a void* to keep the format private in codeobject.c to force
33+
people to go through the proper APIs. */
34+
void *co_extra;
35+
36+
/* Per opcodes just-in-time cache
37+
*
38+
* To reduce cache size, we use indirect mapping from opcode index to
39+
* cache object:
40+
* cache = co_opcache[co_opcache_map[next_instr - first_instr] - 1]
41+
*/
42+
43+
// co_opcache_map is indexed by (next_instr - first_instr).
44+
// * 0 means there is no cache for this opcode.
45+
// * n > 0 means there is cache in co_opcache[n-1].
46+
unsigned char *co_opcache_map;
47+
_PyOpcache *co_opcache;
48+
int co_opcache_flag; // used to determine when create a cache.
49+
unsigned char co_opcache_size; // length of co_opcache.
50+
};
51+
```
52+
53+
54+
## 字节码缓存(opcache)
55+
56+
注意到 PyCodeObject 对象中有一个 co_opcache 属性, 似乎支持字节码缓存, 查看了其它代码发现字节码缓存功能目前只支持 LOAD_GLOBALS.
57+
58+
字节码缓存的基本原理是保存字节码执行的结果, 当再次执行该字节码可以直接返回缓存的结果, 从而提高字节码的执行效率.
59+
60+
从定义 PyCodeObject 的结构体的代码注释中可以看出字节码缓存的实现原理, co_opcache_map 是一个 char 类型的数组, 索引是字节码的偏移量(`offset = next_instr - first_instr`), 如果 `co_opcache_map[offset]` 等于 0 说明该字节码没有缓存, 如果大于 0, 说明该字节码的缓存保存在 `co_opcache[co_opcache_map[offset]]`.
61+
62+
co_opcache 是一个 _PyOpcache 类型的数组, 代码如下:
63+
64+
```c
65+
typedef struct {
66+
PyObject *ptr; /* Cached pointer (borrowed reference) */
67+
uint64_t globals_ver; /* ma_version of global dict */
68+
uint64_t builtins_ver; /* ma_version of builtin dict */
69+
} _PyOpcache_LoadGlobal;
70+
71+
struct _PyOpcache {
72+
union {
73+
_PyOpcache_LoadGlobal lg;
74+
} u;
75+
char optimized;
76+
};
77+
```
78+
79+
`_PyOpcache_LoadGlobal.ptr` 指向缓存的数据, `_PyOpcache_LoadGlobal.globals_ver` 表示缓存数据时 globals(全局变量字典) 的版本, `_PyOpcache_LoadGlobal.builtins_ver` 表示缓存数据时 builtins 的版本.
80+
81+
字典类型内部有一个版本字段 `ma_version_tag`, 每次字典被修改时, 都会增加版本字段. 代码如下:
82+
83+
```c
84+
/*Global counter used to set ma_version_tag field of dictionary.
85+
* It is incremented each time that a dictionary is created and each
86+
* time that a dictionary is modified. */
87+
static uint64_t pydict_global_version = 0;
88+
89+
#define DICT_NEXT_VERSION() (++pydict_global_version)
90+
```
91+
92+
关于 `ma_version_tag` 的更多信息可以查看 [PEP 509 -- Add a private version to dict](https://www.python.org/dev/peps/pep-0509/).
93+
94+
当执行 `LOAD_GLOBAL` 时, 如果缓存存在并且缓存的版本号和当前版本号一致, 那么直接返回缓存的数据.
95+
96+
### 初始化 opcache
97+
98+
```c
99+
int
100+
_PyCode_InitOpcache(PyCodeObject *co)
101+
{
102+
Py_ssize_t co_size = PyBytes_Size(co->co_code) / sizeof(_Py_CODEUNIT);
103+
co->co_opcache_map = (unsigned char *)PyMem_Calloc(co_size, 1);
104+
if (co->co_opcache_map == NULL) {
105+
return -1;
106+
}
107+
108+
_Py_CODEUNIT *opcodes = (_Py_CODEUNIT*)PyBytes_AS_STRING(co->co_code);
109+
Py_ssize_t opts = 0;
110+
111+
for (Py_ssize_t i = 0; i < co_size;) {
112+
unsigned char opcode = _Py_OPCODE(opcodes[i]);
113+
i++; // 'i' is now aligned to (next_instr - first_instr)
114+
115+
// TODO: LOAD_METHOD, LOAD_ATTR
116+
if (opcode == LOAD_GLOBAL) {
117+
opts++;
118+
co->co_opcache_map[i] = (unsigned char)opts;
119+
if (opts > 254) {
120+
break;
121+
}
122+
}
123+
}
124+
125+
if (opts) {
126+
co->co_opcache = (_PyOpcache *)PyMem_Calloc(opts, sizeof(_PyOpcache));
127+
if (co->co_opcache == NULL) {
128+
PyMem_FREE(co->co_opcache_map);
129+
return -1;
130+
}
131+
}
132+
else {
133+
PyMem_FREE(co->co_opcache_map);
134+
co->co_opcache_map = NULL;
135+
co->co_opcache = NULL;
136+
}
137+
138+
co->co_opcache_size = (unsigned char)opts;
139+
return 0;
140+
}
141+
```
142+
143+
### LOAD_GLOBAL 检查 opcache
144+
145+
```c
146+
case TARGET(LOAD_GLOBAL): {
147+
PyObject *name;
148+
PyObject *v;
149+
if (PyDict_CheckExact(f->f_globals)
150+
&& PyDict_CheckExact(f->f_builtins))
151+
{
152+
OPCACHE_CHECK();
153+
if (co_opcache != NULL && co_opcache->optimized > 0) {
154+
_PyOpcache_LoadGlobal *lg = &co_opcache->u.lg;
155+
156+
if (lg->globals_ver ==
157+
((PyDictObject *)f->f_globals)->ma_version_tag
158+
&& lg->builtins_ver ==
159+
((PyDictObject *)f->f_builtins)->ma_version_tag)
160+
{
161+
PyObject *ptr = lg->ptr;
162+
OPCACHE_STAT_GLOBAL_HIT();
163+
assert(ptr != NULL);
164+
Py_INCREF(ptr);
165+
PUSH(ptr);
166+
DISPATCH();
167+
}
168+
}
169+
170+
name = GETITEM(names, oparg);
171+
v = _PyDict_LoadGlobal((PyDictObject *)f->f_globals,
172+
(PyDictObject *)f->f_builtins,
173+
name);
174+
if (v == NULL) {
175+
if (!_PyErr_OCCURRED()) {
176+
/* _PyDict_LoadGlobal() returns NULL without raising
177+
* an exception if the key doesn't exist */
178+
format_exc_check_arg(tstate, PyExc_NameError,
179+
NAME_ERROR_MSG, name);
180+
}
181+
goto error;
182+
}
183+
184+
if (co_opcache != NULL) {
185+
_PyOpcache_LoadGlobal *lg = &co_opcache->u.lg;
186+
187+
if (co_opcache->optimized == 0) {
188+
/* Wasn't optimized before. */
189+
OPCACHE_STAT_GLOBAL_OPT();
190+
} else {
191+
OPCACHE_STAT_GLOBAL_MISS();
192+
}
193+
194+
co_opcache->optimized = 1;
195+
lg->globals_ver =
196+
((PyDictObject *)f->f_globals)->ma_version_tag;
197+
lg->builtins_ver =
198+
((PyDictObject *)f->f_builtins)->ma_version_tag;
199+
lg->ptr = v; /* borrowed */
200+
}
201+
202+
Py_INCREF(v);
203+
}
204+
```
205+
206+
网上搜 "Python opcache" 发现都是关于 PHP 的, 唯一比较有用的信息是一个 [issue](https://bugs.python.org/issue26219). 这个 issue 在 2016 年提出, 2019 年才合并到 python 3.8. 到目前为止只支持 LOAD_GLOBAL, 未来应该会支持 LOAD_ATTR 和 LOAD_METHOD.
207+
208+
突然想到一个手动优化读取全局变量的性能的方法, 在函数内使用一个局部变量保存全局变量的引用, 然后在之后代码都使用该局部变量. 这招对于比较长的属性访问也有帮助, 例如 `foo = obj.a.b.c.d` 可以提高属性访问的速度.
209+
210+
一个例子:
211+
212+
```py
213+
class A:
214+
def __init__(self) -> None:
215+
self.a = 1
216+
217+
class B:
218+
def __init__(self) -> None:
219+
self.a = A()
220+
221+
class C:
222+
def __init__(self) -> None:
223+
self.b = B()
224+
225+
c = C()
226+
print(c.b.a.a)
227+
```
228+
229+
属性访问的字节码:
230+
231+
```
232+
15 48 LOAD_NAME 4 (print)
233+
50 LOAD_NAME 3 (c)
234+
52 LOAD_ATTR 5 (b)
235+
54 LOAD_ATTR 6 (a)
236+
56 LOAD_ATTR 6 (a)
237+
58 CALL_FUNCTION 1
238+
```
239+
240+
241+
242+

‎codes/README.md

Lines changed: 0 additions & 9 deletions
This file was deleted.

‎codes/python_scripts/ch09/simple_obj.py

Lines changed: 0 additions & 4 deletions
This file was deleted.
-170 Bytes
Binary file not shown.

‎codes/python_scripts/ch10/for_control.py

Lines changed: 0 additions & 3 deletions
This file was deleted.

‎codes/python_scripts/ch10/if_control.py

Lines changed: 0 additions & 11 deletions
This file was deleted.

‎codes/python_scripts/ch10/while_control.py

Lines changed: 0 additions & 8 deletions
This file was deleted.

‎codes/python_scripts/ch11/func_00.py

Lines changed: 0 additions & 4 deletions
This file was deleted.

0 commit comments

Comments
(0)

AltStyle によって変換されたページ (->オリジナル) /