Monitoring Reference Counts in Python
Python #reference-counting 2012-11-14 22:44
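This question originated from a thread on the python-cn mailing list: http://groups.google.it/group/python-cn/browse_thread/thread/758891b4342eb2d9/92c12bf6acd667ac
Interestingly, why is the result of sys.getrefcount(11111111) 2 in Python 2.4, yet 3 in Python 2.5? Even more surprising, if you run sys.getrefcount(11111111) in IDLE under Python 2.5, the result is 2 again. We know that sys.getrefcount returns an object's reference count, so why does the same code, applied to the same object, give different reference counts in different runtime environments? The reason is that we have overlooked something crucial.
When examining sys.getrefcount, we only looked at its runtime result, but behind execution there is a more important step at work: compilation. To confirm whether compilation affects an object's reference count, let's look at a few examples:
1. In the interactive environment: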
Python 2.5 (r25:51908, May 27 2007, 09:33:26) [MSC v.1310 32 bit (Inte
32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getrefcount(11111111)
3
2. Executed as a file:
[ref.py]
import sys
print sys.getrefcount(11111111)
The result:
F:/compile/Python-2.5/PCbuild>python ref.py
3
3. Executed as a file (avoiding the effect of compilation):
[demo.py]
import ref
The result:
F:/compile/Python-2.5/PCbuild>python demo.py
2
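This shows that the extra reference is contributed by Python's compilation step. In methods 1 and 2, Python compiles the source at the start of every run. In method 3, the import mechanism produces a ref.pyc file, so compilation is not triggered on every run: once ref.pyc exists, the module is loaded from it instead.
However, both the Python 2.5 interactive environment and IDLE need to compile the code, yet they print different results. Why? To answer that, we need to look in the Python source code. First, let's see what sys.getrefcount actually returns: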
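The compiler's role can also be observed from Python itself. The sketch below assumes Python 3 (the sessions above use Python 2.5) and only shows that compiling a statement stores the integer in the code object's co_consts tuple, which is one place where a reference created by compilation lives. The count printed by sys.getrefcount depends on the interpreter version, so no expected value is given.

```python
import sys

SOURCE = "value = 11111111"

# Compiling turns the literal into a constant owned by the code object.
code = compile(SOURCE, "<demo>", "exec")

# Look the constant up in the code object's constants tuple.
constants = [c for c in code.co_consts if c == 11111111]
has_constant = len(constants) == 1

# The code object keeps the integer alive; the exact count is version dependent.
count = sys.getrefcount(constants[0])
print("constant stored in co_consts:", has_constant)
print("reference count seen from Python:", count)
```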
[sysmodule.c]
static PyObject *
sys_getrefcount(PyObject *self, PyObject *arg)
{
    /* Monitoring code added for this experiment. */
    if (arg != NULL && PyInt_Check(arg)) {
        if (PyInt_AsLong(arg) == 11111111) {
            printf("in sys_getrefcount\n");
        }
    }
    return PyInt_FromSsize_t(arg->ob_refcnt);
}
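Seen from the Python side, the value returned is a plain counter that moves by one for every reference taken or dropped. The following minimal sketch assumes Python 3 and uses a hypothetical Payload class; it compares counts only as differences, because the absolute number includes the temporary reference held by the call itself and varies between versions.

```python
import sys


class Payload:
    """A throwaway object whose reference count we can observe."""


obj = Payload()
baseline = sys.getrefcount(obj)   # absolute value is version dependent

alias = obj                       # one more name bound to the same object
with_alias = sys.getrefcount(obj)

holders = [obj, obj, obj]         # three more references stored in a list
with_list = sys.getrefcount(obj)

del alias, holders                # drop the four extra references again
restored = sys.getrefcount(obj)

print(with_alias - baseline, with_list - baseline, restored - baseline)
```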
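So the value returned is simply the ob_refcnt field of the PyObject. To really understand how the reference count changes, we modify the Python source code to monitor every change to it. This requires modifying the two macros that Python uses to change reference counts: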
[object.h]
#define Py_INCREF(op) (                         \
    robertincref((PyObject*)(op)) ,             \
    (op)->ob_refcnt++)
#define Py_DECREF(op)                           \
    if (robertdecref((PyObject*)(op)) ,         \
        --(op)->ob_refcnt != 0)                 \
        _Py_CHECK_REFCNT(op)                    \
    else                                        \
        _Py_Dealloc((PyObject *)(op))
[object.c]
/* Report each increment of the reference count of the integer 11111111. */
void robertincref(PyObject* obj) {
    if (PyInt_Check(obj) && (PyInt_AsLong(obj) == 11111111)) {
        long refcnt = obj->ob_refcnt;
        printf("increase ref count from %ld to %ld\n", refcnt, refcnt+1);
    }
}
/* Report each decrement of the reference count of the integer 11111111. */
void robertdecref(PyObject* obj) {
    if (PyInt_Check(obj) && (PyInt_AsLong(obj) == 11111111)) {
        long refcnt = obj->ob_refcnt;
        printf("decrease ref count from %ld to %ld\n", refcnt, refcnt-1);
    }
}
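The hooks above only log and leave the counting to the macros. As a rough illustration of the lines they produce, here is a toy model in Python 3. It is not CPython code: TracedObject, incref, decref and the "dealloc" marker are all hypothetical names invented for this sketch, and the sequence replayed at the bottom is made up.

```python
class TracedObject:
    """Toy stand-in for a PyObject; it models only the ob_refcnt field."""

    def __init__(self, log):
        self.ob_refcnt = 1      # a freshly created object starts at 1
        self.log = log
        self.log.append("create")


def incref(obj):
    # Like robertincref: report the change first, then apply it.
    obj.log.append("increase ref count from %d to %d"
                   % (obj.ob_refcnt, obj.ob_refcnt + 1))
    obj.ob_refcnt += 1


def decref(obj):
    # Like robertdecref: report the change first, then apply it.
    obj.log.append("decrease ref count from %d to %d"
                   % (obj.ob_refcnt, obj.ob_refcnt - 1))
    obj.ob_refcnt -= 1
    if obj.ob_refcnt == 0:
        obj.log.append("dealloc")   # hypothetical marker, not in the real trace


log = []
number = TracedObject(log)   # creation
incref(number)               # a caller takes a reference
observed = number.ob_refcnt  # what a getrefcount-style read would see here
decref(number)               # the caller releases it
decref(number)               # the creator releases the last reference

print("\n".join(log))
print("observed:", observed)
```

The count that is read is whatever the object happens to hold at that moment, so any reference taken earlier and not yet released shows up in the result.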
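We also added monitoring code for 11111111 at the point where pyc files are read (r_object), where integer objects are created (PyInt_FromLong), and where the reference count is retrieved (sys_getrefcount). The final output is as follows:
Execution method 1 (Python 2.5 interactive environment):
create 11111111 in PyInt_FromLong!
increase ref count from 1 to 2 // output from monitoring Py_INCREF
decrease ref count from 2 to 1 // output from monitoring Py_DECREF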
increase ref count from 1 to 2
increase ref count from 2 to 3
increase ref count from 3 to 4
decrease ref count from 4 to 3
increase ref count from 3 to 4
decrease ref count from 4 to 3
decrease ref count from 3 to 2
LOAD_CONST for 11111111
increase ref count from 2 to 3
in sys_getrefcount // this line shows that the Python virtual machine is now inside sys_getrefcount
decrease ref count from 3 to 2
3 // this is the printed result
decrease ref count from 2 to 1
decrease ref count from 1 to 0
Execution method 3 (loading the pyc file):
read 11111111 in r_object
create 11111111 in PyInt_FromLong!
LOAD_CONST for 11111111
increase ref count from 1 to 2
in sys_getrefcount
decrease ref count from 2 to 1
2
decrease ref count from 1 to 0
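LOAD_CONST here is a Python bytecode instruction. It is part of what the sys.getrefcount(11111111) call compiles to, and we added monitoring code to the implementation of this instruction as well.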
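The instruction can be seen without patching the interpreter by disassembling the same expression. This sketch assumes Python 3 and the standard dis module; the surrounding instructions differ between versions, so it prints only the one that loads the integer.

```python
import dis

# Compile the expression from the article without running it.
code = compile("sys.getrefcount(11111111)", "<demo>", "eval")

# Keep only the instructions whose argument is the integer literal.
loads = [ins for ins in dis.get_instructions(code) if ins.argval == 11111111]

for ins in loads:
    print(ins.opname, ins.argval)
```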
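In the interactive run, compilation adjusts the integer object's reference count many times, whereas the sequence of actions in execution method 3 is easy to follow:
1. The Python virtual machine reads the integer 11111111 from the pyc file through the r_object function. This calls PyInt_FromLong, which creates the integer object and sets ob_refcnt to 1.
2. The bytecode compiled from print sys.getrefcount(11111111) in ref.py contains a LOAD_CONST instruction, which increments the integer object's reference count through Py_INCREF, bringing it to 2. This is the value sys_getrefcount then reads, which is why 2 is printed.
In the interactive environment, compilation has a large effect on the object's reference count, and in IDLE the adjustments are even more frequent. Here is the output when running under IDLE: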
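The load-from-pyc path can be approximated from Python with the marshal module, which is the serialization format used for the code object stored in a pyc file. This is a sketch assuming Python 3: it round-trips a code object in memory rather than reading a real pyc file, and it uses a simple assignment instead of the article's print statement.

```python
import marshal

code = compile("value = 11111111", "<demo>", "exec")

# Serialize the code object, roughly what the body of a .pyc file holds.
payload = marshal.dumps(code)

# Deserializing builds new objects for the code and its constants,
# without invoking the compiler again.
loaded = marshal.loads(payload)

namespace = {}
exec(loaded, namespace)
print(11111111 in loaded.co_consts, namespace["value"])
```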
create 11111111 in PyInt_FromLong!
increase ref count from 1 to 2
decrease ref count from 2 to 1
increase ref count from 1 to 2
increase ref count from 2 to 3
increase ref count from 3 to 4
decrease ref count from 4 to 3
increase ref count from 3 to 4
decrease ref count from 4 to 3
decrease ref count from 3 to 2
decrease ref count from 2 to 1
create 11111111 in PyInt_FromLong!
increase ref count from 1 to 2
decrease ref count from 2 to 1
increase ref count from 1 to 2
increase ref count from 2 to 3
increase ref count from 3 to 4
decrease ref count from 4 to 3
increase ref count from 3 to 4
decrease ref count from 4 to 3
decrease ref count from 3 to 2
decrease ref count from 2 to 1
create 11111111 in PyInt_FromLong!
increase ref count from 1 to 2
decrease ref count from 2 to 1
increase ref count from 1 to 2
increase ref count from 2 to 3
increase ref count from 3 to 4
decrease ref count from 4 to 3
increase ref count from 3 to 4
decrease ref count from 4 to 3
decrease ref count from 3 to 2
decrease ref count from 2 to 1
decrease ref count from 1 to 0
decrease ref count from 1 to 0
read 11111111 in r_object
create 11111111 in PyInt_FromLong!
decrease ref count from 1 to 0
LOAD_CONST for 11111111
increase ref count from 1 to 2
in sys_getrefcount
decrease ref count from 2 to 1