Introduction to the Python Interpreter

I came across an excellent article explaining the Python interpreter, Introduction to the Python Interpreter, and have reworked it here with my own understanding added.

What the interpreter does

Python 2.7.10 (default, Feb  7 2017, 00:08:15)
Type "copyright", "credits" or "license" for more information.

In [1]: print('Hello World')
Hello World

What happens between typing print('Hello World') in the shell, pressing Enter, and seeing Hello World printed?

After Enter is pressed, the input goes through four steps: lexing -> parsing -> compiling -> interpreting.
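The four stages can be observed from Python itself. The sketch below is only an illustration, assuming Python 3 and the standard-library modules tokenize and ast plus the built-ins compile and exec. CPython performs these steps internally in C, so this mirrors the stages rather than reproducing the real implementation.

```python
import ast
import io
import tokenize

source = "result = 1 + 2\n"

# 1. Lexing: split the source text into tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
# Keep only tokens with visible text (drops NEWLINE and ENDMARKER).
token_strings = [tok.string for tok in tokens if tok.string.strip()]

# 2. Parsing: build an abstract syntax tree from the source.
tree = ast.parse(source)

# 3. Compiling: turn the tree into a code object that holds bytecode.
code = compile(tree, filename="<example>", mode="exec")

# 4. Interpreting: execute the code object in a namespace.
namespace = {}
exec(code, namespace)

print(token_strings)         # ['result', '=', '1', '+', '2']
print(type(tree).__name__)   # Module
print(type(code).__name__)   # code
print(namespace["result"])   # 3
```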

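Compiling produces a PyCodeObject; the interpreter then executes that PyCodeObject and prints the result. For imported modules, CPython also caches the compiled code object in a file with the .pyc suffix, whereas code typed at the interactive prompt is compiled in memory only.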
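The body of a .pyc file is a marshalled code object, preceded by a small header whose layout varies between Python versions. The following sketch, assuming Python 3, skips the file and the header entirely and shows only the serialise and restore round trip with the standard marshal module.

```python
import marshal

# Compile a small module-level snippet into a code object.
code = compile("answer = 6 * 7", "<example>", "exec")

# Serialise the code object to bytes, the same form used for the body of a .pyc file.
payload = marshal.dumps(code)

# Restore the code object from those bytes and execute it.
restored = marshal.loads(payload)
namespace = {}
exec(restored, namespace)

print(type(payload).__name__)  # bytes
print(namespace["answer"])     # 42
```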

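A .pyc file can be compared to a Java .class file: both contain bytecode plus some other related information (the context needed to run it, and so on).

A .pyc file is executed by a Python interpreter (for example, CPython).

A .class file is executed by Java's interpreter (the JVM).

The difference is that Python's interpreter works at a higher level than Java's: a single Python bytecode instruction such as BINARY_ADD has to work out the operand types at run time, while JVM bytecode is lower-level and already typed.

By the same logic, can C/C++ be understood as sitting lower still, compiling to native code that runs directly on Linux with no VM layered on top of the operating system?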

function object

>>> def foo(a):
...     x = 3
...     return x + a
...
>>> foo
<function foo at 0x107ef7aa0>

“Functions are first-class objects” means that functions are objects, just as a list is an object or an instance of MyObject is an object.

A function is an object just like an int or a list, and that is why functions are called first-class objects.
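A quick way to see what "first-class" means in practice is to treat foo like any other value. The sketch below reuses the foo defined above; the names alias, registry and call_with are made up for illustration.

```python
def foo(a):
    x = 3
    return x + a

# A function can be bound to another name, like any other object.
alias = foo

# It can be stored in a container.
registry = {"add_three": foo}

# It can be passed to another function as an argument.
def call_with(func, value):
    return func(value)

print(alias is foo)                # True
print(registry["add_three"](2))    # 5
print(call_with(foo, 4))           # 7
```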

>>> foo.func_code
<code object foo at 0x107eeccb0, file "<stdin>", line 1>

foo.func_code is a PyCodeObject. When a function is called, what actually gets executed (interpreted) is its foo.func_code.
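One way to convince yourself that the code object is what really runs is to wrap the same code object in a second function. This sketch assumes Python 3, where the attribute is spelled foo.__code__ rather than the Python 2 foo.func_code used in this post; the name clone is hypothetical.

```python
import types

def foo(a):
    x = 3
    return x + a

# Build a brand-new function object around foo's existing code object.
clone = types.FunctionType(foo.__code__, globals(), "clone")

print(clone is foo)                    # False: two distinct function objects
print(clone.__code__ is foo.__code__)  # True: one shared code object
print(clone(4))                        # 7
```

Both functions behave identically because the behaviour lives in the shared code object, not in the function wrapper.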

PyCodeObject

>>> dir(foo.func_code)
['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__',
'__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename',
'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals',
'co_stacksize', 'co_varnames']

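A PyCodeObject has the attributes listed above. They hold the bytecode together with the information needed to execute it, such as arguments, constants and names.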

>>> foo.func_code.co_varnames
('a', 'x')
>>> foo.func_code.co_consts
(None, 3)
>>> foo.func_code.co_argcount
1
>>> foo.func_code.co_code
'd\x01\x00}\x01\x00|\x01\x00|\x00\x00\x17S'
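
Attribute      Meaning
co_varnames    names of the local variables (arguments first)
co_consts      constant values used by the bytecode
co_argcount    number of arguments
co_code        the bytecode (the sequence of instructions run when the function executes)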
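The same attributes can be read in Python 3, with the caveat that the attribute is foo.__code__ there and that co_code is a bytes object rather than a str. The sketch below only prints the values; the exact contents of co_consts and co_code depend on the Python version, so no output is shown for them.

```python
def foo(a):
    x = 3
    return x + a

code = foo.__code__  # Python 3 spelling of foo.func_code

print(code.co_varnames)     # ('a', 'x'): arguments first, then other locals
print(code.co_argcount)     # 1
print(code.co_consts)       # constants referenced by the bytecode
print(type(code.co_code))   # raw bytecode, a bytes object in Python 3
```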

Bytecode

>>> [ord(b) for b in foo.func_code.co_code]
[100, 1, 0, 125, 1, 0, 124, 1, 0, 124, 0, 0, 23, 83]

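Each number above is one byte of bytecode: either an opcode that names an instruction, or part of that instruction's argument. The interpreter executes these instructions; the table mapping opcode numbers to instruction names is included at the end of this post.

What each instruction does is described in Python Bytecode Instructions.

The interpreter will loop through each byte, look up what it should do for each one, and then do that thing.

Let's walk through the instructions in detail.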

>>> def foo(a):
...     x = 3
...     return x + a
...
>>> import dis
>>> dis.dis(foo.func_code)
  2           0 LOAD_CONST               1 (3)
              3 STORE_FAST               1 (x)

  3           6 LOAD_FAST                1 (x)
              9 LOAD_FAST                0 (a)
             12 BINARY_ADD
             13 RETURN_VALUE

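From the left:

  • The first column is the source line number the instruction belongs to
  • The second column is the instruction's index within foo.func_code.co_code
  • The third column is the instruction's name

The last two columns describe the argument the instruction needs:

  • The fourth column is the argument, an index (which PyCodeObject attribute it indexes into depends on the instruction)
  • The fifth column is the value that index refers to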
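The columns described above are also available programmatically. This is a sketch assuming Python 3.4 or later, where dis.get_instructions exists; the opcodes and offsets it reports differ from the Python 2.7 listing in this post, so the output is not reproduced here.

```python
import dis

def foo(a):
    x = 3
    return x + a

# Each Instruction exposes the fields that dis.dis prints as columns:
# offset (column 2), opname (column 3), arg (column 4), argrepr (column 5).
rows = [
    (ins.offset, ins.opname, ins.arg, ins.argrepr)
    for ins in dis.get_instructions(foo)
]

for offset, opname, arg, argrepr in rows:
    print(offset, opname, arg, argrepr)
```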

Line 2 of the source is x = 3, and its bytecode is LOAD_CONST followed by STORE_FAST.

The LOAD_CONST opcode is foo.func_code.co_code[0].

The STORE_FAST opcode is foo.func_code.co_code[3].

Here is what LOAD_CONST actually does:

LOAD_CONST(consti)

Pushes co_consts[consti] onto the stack.

LOAD_CONST pushes the value co_consts[consti] onto the top of the stack. In the example above, consti is 1 and co_consts[consti] is 3.

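#define STOP_CODE 0

Opcode 0 is STOP_CODE, but the zeros in the byte list are not STOP_CODE instructions and do not mark the end of anything. In Python 2.7, every opcode at or above HAVE_ARGUMENT (90) is followed by a two-byte argument, stored low byte first.

So in [100, 1, 0, 125, 1, 0, 124, 1, 0, 124, 0, 0, 23, 83], the first three bytes decode as:

  • 100 is LOAD_CONST
  • 1 is the low byte of the argument
  • 0 is the high byte of the argument, giving consti = 1 + (0 << 8) = 1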
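To check the byte layout by hand, here is a small decoder for the list above. It is a sketch under the assumption of Python 2.7 style bytecode, where an opcode below HAVE_ARGUMENT (90) occupies one byte and any other opcode is followed by two argument bytes, low byte first. The helper name decode and the OPNAMES dictionary are made up for this example; the numbers in it are copied from the opcode.h listing at the end of the post.

```python
HAVE_ARGUMENT = 90  # from Include/opcode.h in Python 2.7

# Subset of the Python 2.7 opcode table, just enough for this example.
OPNAMES = {
    23: "BINARY_ADD",
    83: "RETURN_VALUE",
    100: "LOAD_CONST",
    124: "LOAD_FAST",
    125: "STORE_FAST",
}

def decode(code_bytes):
    """Yield (offset, name, argument) for Python 2.7 style bytecode."""
    i = 0
    while i < len(code_bytes):
        op = code_bytes[i]
        if op >= HAVE_ARGUMENT:
            # Two argument bytes follow the opcode, low byte first.
            arg = code_bytes[i + 1] | (code_bytes[i + 2] << 8)
            yield i, OPNAMES[op], arg
            i += 3
        else:
            # No argument: the instruction is a single byte.
            yield i, OPNAMES[op], None
            i += 1

co_code = [100, 1, 0, 125, 1, 0, 124, 1, 0, 124, 0, 0, 23, 83]
decoded = list(decode(co_code))
for row in decoded:
    print(row)
```

The offsets it yields (0, 3, 6, 9, 12, 13) match the second column of the dis.dis output.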

Now let's look at the STORE_FAST instruction:

STORE_FAST(var_num)

Stores TOS(top of stack) into the local co_varnames[var_num].

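It pops the top element off the stack and stores it in the local variable co_varnames[var_num]. The remaining instructions follow the same pattern as LOAD_CONST.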
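Putting the pieces together, the whole function can be run by a toy interpreter loop. This is a minimal sketch and not how CPython is implemented: it assumes the Python 2.7 byte layout (one opcode byte, plus two little-endian argument bytes for opcodes at or above 90) and supports only the five opcodes that foo uses. The function name run and the variable fast_locals are hypothetical.

```python
LOAD_CONST, LOAD_FAST, STORE_FAST = 100, 124, 125
BINARY_ADD, RETURN_VALUE = 23, 83
HAVE_ARGUMENT = 90

def run(co_code, co_consts, co_varnames, args):
    """Interpret a tiny subset of Python 2.7 bytecode (illustrative only)."""
    stack = []
    # One slot per local variable; the arguments fill the first slots.
    fast_locals = list(args) + [None] * (len(co_varnames) - len(args))
    pc = 0
    while True:
        op = co_code[pc]
        pc += 1
        arg = None
        if op >= HAVE_ARGUMENT:
            # Read the two-byte argument, low byte first.
            arg = co_code[pc] | (co_code[pc + 1] << 8)
            pc += 2
        if op == LOAD_CONST:
            stack.append(co_consts[arg])
        elif op == STORE_FAST:
            fast_locals[arg] = stack.pop()
        elif op == LOAD_FAST:
            stack.append(fast_locals[arg])
        elif op == BINARY_ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == RETURN_VALUE:
            return stack.pop()
        else:
            raise ValueError("unsupported opcode: %d" % op)

co_code = [100, 1, 0, 125, 1, 0, 124, 1, 0, 124, 0, 0, 23, 83]
result = run(co_code, (None, 3), ("a", "x"), args=[4])
print(result)  # 7, the same as foo(4)
```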

Opcode numbers for each instruction

// Include/opcode.h

/* Instruction opcodes for compiled code */

#define STOP_CODE 0
#define POP_TOP 1
#define ROT_TWO 2
#define ROT_THREE 3
#define DUP_TOP 4
#define ROT_FOUR 5
#define NOP 9

#define UNARY_POSITIVE 10
#define UNARY_NEGATIVE 11
#define UNARY_NOT 12
#define UNARY_CONVERT 13

#define UNARY_INVERT 15

#define LIST_APPEND 18
#define BINARY_POWER 19

#define BINARY_MULTIPLY 20
#define BINARY_DIVIDE 21
#define BINARY_MODULO 22
#define BINARY_ADD 23
#define BINARY_SUBTRACT 24
#define BINARY_SUBSCR 25
#define BINARY_FLOOR_DIVIDE 26
#define BINARY_TRUE_DIVIDE 27
#define INPLACE_FLOOR_DIVIDE 28
#define INPLACE_TRUE_DIVIDE 29

#define SLICE 30
/* Also uses 31-33 */

#define STORE_SLICE 40
/* Also uses 41-43 */

#define DELETE_SLICE 50
/* Also uses 51-53 */

#define INPLACE_ADD 55
#define INPLACE_SUBTRACT 56
#define INPLACE_MULTIPLY 57
#define INPLACE_DIVIDE 58
#define INPLACE_MODULO 59
#define STORE_SUBSCR 60
#define DELETE_SUBSCR 61

#define BINARY_LSHIFT 62
#define BINARY_RSHIFT 63
#define BINARY_AND 64
#define BINARY_XOR 65
#define BINARY_OR 66
#define INPLACE_POWER 67
#define GET_ITER 68

#define PRINT_EXPR 70
#define PRINT_ITEM 71
#define PRINT_NEWLINE 72
#define PRINT_ITEM_TO 73
#define PRINT_NEWLINE_TO 74
#define INPLACE_LSHIFT 75
#define INPLACE_RSHIFT 76
#define INPLACE_AND 77
#define INPLACE_XOR 78
#define INPLACE_OR 79
#define BREAK_LOOP 80
#define WITH_CLEANUP 81
#define LOAD_LOCALS 82
#define RETURN_VALUE 83
#define IMPORT_STAR 84
#define EXEC_STMT 85
#define YIELD_VALUE 86
#define POP_BLOCK 87
#define END_FINALLY 88
#define BUILD_CLASS 89

#define HAVE_ARGUMENT 90 /* Opcodes from here have an argument: */

#define STORE_NAME 90 /* Index in name list */
#define DELETE_NAME 91 /* "" */
#define UNPACK_SEQUENCE 92 /* Number of sequence items */
#define FOR_ITER 93

#define STORE_ATTR 95 /* Index in name list */
#define DELETE_ATTR 96 /* "" */
#define STORE_GLOBAL 97 /* "" */
#define DELETE_GLOBAL 98 /* "" */
#define DUP_TOPX 99 /* number of items to duplicate */
#define LOAD_CONST 100 /* Index in const list */
#define LOAD_NAME 101 /* Index in name list */
#define BUILD_TUPLE 102 /* Number of tuple items */
#define BUILD_LIST 103 /* Number of list items */
#define BUILD_MAP 104 /* Always zero for now */
#define LOAD_ATTR 105 /* Index in name list */
#define COMPARE_OP 106 /* Comparison operator */
#define IMPORT_NAME 107 /* Index in name list */
#define IMPORT_FROM 108 /* Index in name list */

#define JUMP_FORWARD 110 /* Number of bytes to skip */
#define JUMP_IF_FALSE 111 /* "" */
#define JUMP_IF_TRUE 112 /* "" */
#define JUMP_ABSOLUTE 113 /* Target byte offset from beginning of code */

#define LOAD_GLOBAL 116 /* Index in name list */

#define CONTINUE_LOOP 119 /* Start of loop (absolute) */
#define SETUP_LOOP 120 /* Target address (absolute) */
#define SETUP_EXCEPT 121 /* "" */
#define SETUP_FINALLY 122 /* "" */

#define LOAD_FAST 124 /* Local variable number */
#define STORE_FAST 125 /* Local variable number */
#define DELETE_FAST 126 /* Local variable number */

#define RAISE_VARARGS 130 /* Number of raise arguments (1, 2 or 3) */
/* CALL_FUNCTION_XXX opcodes defined below depend on this definition */
#define CALL_FUNCTION 131 /* #args + (#kwargs<<8) */
#define MAKE_FUNCTION 132 /* #defaults */
#define BUILD_SLICE 133 /* Number of items */

#define MAKE_CLOSURE 134 /* #free vars */
#define LOAD_CLOSURE 135 /* Load free variable from closure */
#define LOAD_DEREF 136 /* Load and dereference from closure cell */
#define STORE_DEREF 137 /* Store into cell */
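The numbers in this header are specific to Python 2.7 and change between releases, so in running code it is safer to look them up than to hard-code them. The sketch below assumes Python 3 and uses dis.opmap (name to number) and dis.opname (number to name) from the standard library; no numeric output is shown because the values depend on the interpreter version.

```python
import dis

names = ["LOAD_CONST", "LOAD_FAST", "STORE_FAST", "RETURN_VALUE"]

# Look up the numeric opcode for each instruction name in this interpreter.
table = {name: dis.opmap[name] for name in names}

for name, number in table.items():
    # dis.opname maps the number back to the instruction name.
    print(number, dis.opname[number])
```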

Title: Introduction to the Python Interpreter

Author: 定。

Published: 2019-08-31, 22:08

Original link: http://cocofe.cn/2019/08/31/introdution-to-the-python-interpreter/

License: Attribution-NonCommercial 4.0

Please keep the information above when reposting.