Performance issues in Python for small loops over simple array look-ups

Updated: 2023-10-16

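Why are simple loops and/or simple array look-ups so slow in Python?

Specifically, in the example below, Python (using pypy) is about 9 times slower than C++ (using -O2). What is the technical reason for this performance loss? Is it the way Python loops are implemented in machine code? Differences in the optimizations the compilers apply? Memory management? Or something else?

Python code: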
# File: timing.py
import sys
T = [ 
        [ 0, 6, 7, 7, 7, 8, 6, 7, 8,19,20, 7, 7, 7, 7, 7, 7,21,22,19,20,21,22,29, 7, 7, 7, 7, 7,29,35,36, 7, 7, 7,35,36, 7, 7,], 
        [ 9, 7,10, 7, 7, 7, 1, 7,23, 7, 7, 1,23, 7, 7, 7, 7, 7, 7, 9,10,30,31, 7, 9,10,30,31, 7,23, 7, 7,23, 7, 7,30,31,30,31,], 
        [ 7, 7, 2,11,12, 7, 7, 7, 7, 7, 7,11,12,24,25,26,27, 7, 7, 7, 7, 7, 7, 7,24,25,26,27,32, 7, 7, 7,32,37,38, 7, 7,37,38,], 
        [13, 7,14, 7, 7, 7, 3, 7,28, 7, 7, 3,28, 7, 7, 7, 7, 7, 7,13,14,33,34, 7,13,14,33,34, 7,28, 7, 7,28, 7, 7,33,34,33,34,], 
        [15, 7,16, 7, 7, 7, 7, 7, 4, 7, 7, 7, 4, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [17, 7,18, 7, 7, 7, 7, 7, 5, 7, 7, 7, 5, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [19, 7,20, 7, 7, 7, 6, 7,29, 7, 7, 6,29, 7, 7, 7, 7, 7, 7,19,20,35,36, 7,19,20,35,36, 7,29, 7, 7,29, 7, 7,35,36,35,36,], 
        [ 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [21, 7,22, 7, 7, 7, 7, 7, 8, 7, 7, 7, 8, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [ 9, 1, 7, 7, 7,23, 1, 7,23, 9,10, 7, 7, 7, 7, 7, 7,30,31, 9,10,30,31,23, 7, 7, 7, 7, 7,23,30,31, 7, 7, 7,30,31, 7, 7,], 
        [ 7, 7,10, 1,23, 7, 7, 7, 7, 7, 7, 1,23, 9,10,30,31, 7, 7, 7, 7, 7, 7, 7, 9,10,30,31,23, 7, 7, 7,23,30,31, 7, 7,30,31,], 
        [24, 7,25, 7, 7, 7,11, 7,32, 7, 7,11,32, 7, 7, 7, 7, 7, 7,24,25,37,38, 7,24,25,37,38, 7,32, 7, 7,32, 7, 7,37,38,37,38,], 
        [26, 7,27, 7, 7, 7, 7, 7,12, 7, 7, 7,12, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [13, 3, 7, 7, 7,28, 3, 7,28,13,14, 7, 7, 7, 7, 7, 7,33,34,13,14,33,34,28, 7, 7, 7, 7, 7,28,33,34, 7, 7, 7,33,34, 7, 7,], 
        [ 7, 7,14, 3,28, 7, 7, 7, 7, 7, 7, 3,28,13,14,33,34, 7, 7, 7, 7, 7, 7, 7,13,14,33,34,28, 7, 7, 7,28,33,34, 7, 7,33,34,], 
        [15, 7, 7, 7, 7, 4, 7, 7, 4, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [ 7, 7,16, 7, 4, 7, 7, 7, 7, 7, 7, 7, 4, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [17, 7, 7, 7, 7, 5, 7, 7, 5, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [ 7, 7,18, 7, 5, 7, 7, 7, 7, 7, 7, 7, 5, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [19, 6, 7, 7, 7,29, 6, 7,29,19,20, 7, 7, 7, 7, 7, 7,35,36,19,20,35,36,29, 7, 7, 7, 7, 7,29,35,36, 7, 7, 7,35,36, 7, 7,], 
        [ 7, 7,20, 6,29, 7, 7, 7, 7, 7, 7, 6,29,19,20,35,36, 7, 7, 7, 7, 7, 7, 7,19,20,35,36,29, 7, 7, 7,29,35,36, 7, 7,35,36,], 
        [21, 7, 7, 7, 7, 8, 7, 7, 8, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [ 7, 7,22, 7, 8, 7, 7, 7, 7, 7, 7, 7, 8, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [30, 7,31, 7, 7, 7, 7, 7,23, 7, 7, 7,23, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [24,11, 7, 7, 7,32,11, 7,32,24,25, 7, 7, 7, 7, 7, 7,37,38,24,25,37,38,32, 7, 7, 7, 7, 7,32,37,38, 7, 7, 7,37,38, 7, 7,], 
        [ 7, 7,25,11,32, 7, 7, 7, 7, 7, 7,11,32,24,25,37,38, 7, 7, 7, 7, 7, 7, 7,24,25,37,38,32, 7, 7, 7,32,37,38, 7, 7,37,38,], 
        [26, 7, 7, 7, 7,12, 7, 7,12, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [ 7, 7,27, 7,12, 7, 7, 7, 7, 7, 7, 7,12, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [33, 7,34, 7, 7, 7, 7, 7,28, 7, 7, 7,28, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [35, 7,36, 7, 7, 7, 7, 7,29, 7, 7, 7,29, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [30, 7, 7, 7, 7,23, 7, 7,23, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [ 7, 7,31, 7,23, 7, 7, 7, 7, 7, 7, 7,23, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [37, 7,38, 7, 7, 7, 7, 7,32, 7, 7, 7,32, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [33, 7, 7, 7, 7,28, 7, 7,28, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [ 7, 7,34, 7,28, 7, 7, 7, 7, 7, 7, 7,28, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [35, 7, 7, 7, 7,29, 7, 7,29, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [ 7, 7,36, 7,29, 7, 7, 7, 7, 7, 7, 7,29, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [37, 7, 7, 7, 7,32, 7, 7,32, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
        [ 7, 7,38, 7,32, 7, 7, 7, 7, 7, 7, 7,32, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,], 
            ]
M = range(39)
idempotents = [0,2,6,7,8,9,11,12,14,16,17,19,21,25,27]
omega = [0,7,2,7,7,7,6,7,8,9,7,11,12,7,14,7,16,17,7,19,7,21,7,7,7,25,7,27,7,7,7,7,7,7,7,7,7,7,7]
def check():
    for e in idempotents:
        for x in M:
            ex = T[e][x]
            for s in M:
                es = T[e][s]
                for f in idempotents:
                    exf = T[ex][f]
                    esf = T[es][f]
                    for y in M:
                        exfy = omega[T[exf][y]]
                        for t in M:
                            tesf = omega[T[t][esf]]
                            if T[T[exfy][exf]][tesf] != T[T[exfy][esf]][tesf]:
                                return 0
    return 1
sys.exit(check())

C++ code (requires C++11 because of the new array initialization and iteration syntax):

// File: timing.cc
// Compile via 'g++ -std=c++11 -O2 timing.cc'
// Run via 'time ./a.out'
#include <vector>
#include <cstddef>
int main(int, char **) {
  const size_t N = 39;
  typedef unsigned element_t;
  const std::vector<std::vector<element_t>> T{{
     {{ 0, 6, 7, 7, 7, 8, 6, 7, 8,19,20, 7, 7, 7, 7, 7, 7,21,22,19,20,21,22,29, 7, 7, 7, 7, 7,29,35,36, 7, 7, 7,35,36, 7, 7,}}, 
     {{ 9, 7,10, 7, 7, 7, 1, 7,23, 7, 7, 1,23, 7, 7, 7, 7, 7, 7, 9,10,30,31, 7, 9,10,30,31, 7,23, 7, 7,23, 7, 7,30,31,30,31,}}, 
     {{ 7, 7, 2,11,12, 7, 7, 7, 7, 7, 7,11,12,24,25,26,27, 7, 7, 7, 7, 7, 7, 7,24,25,26,27,32, 7, 7, 7,32,37,38, 7, 7,37,38,}}, 
     {{13, 7,14, 7, 7, 7, 3, 7,28, 7, 7, 3,28, 7, 7, 7, 7, 7, 7,13,14,33,34, 7,13,14,33,34, 7,28, 7, 7,28, 7, 7,33,34,33,34,}}, 
     {{15, 7,16, 7, 7, 7, 7, 7, 4, 7, 7, 7, 4, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{17, 7,18, 7, 7, 7, 7, 7, 5, 7, 7, 7, 5, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{19, 7,20, 7, 7, 7, 6, 7,29, 7, 7, 6,29, 7, 7, 7, 7, 7, 7,19,20,35,36, 7,19,20,35,36, 7,29, 7, 7,29, 7, 7,35,36,35,36,}}, 
     {{ 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{21, 7,22, 7, 7, 7, 7, 7, 8, 7, 7, 7, 8, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{ 9, 1, 7, 7, 7,23, 1, 7,23, 9,10, 7, 7, 7, 7, 7, 7,30,31, 9,10,30,31,23, 7, 7, 7, 7, 7,23,30,31, 7, 7, 7,30,31, 7, 7,}}, 
     {{ 7, 7,10, 1,23, 7, 7, 7, 7, 7, 7, 1,23, 9,10,30,31, 7, 7, 7, 7, 7, 7, 7, 9,10,30,31,23, 7, 7, 7,23,30,31, 7, 7,30,31,}}, 
     {{24, 7,25, 7, 7, 7,11, 7,32, 7, 7,11,32, 7, 7, 7, 7, 7, 7,24,25,37,38, 7,24,25,37,38, 7,32, 7, 7,32, 7, 7,37,38,37,38,}}, 
     {{26, 7,27, 7, 7, 7, 7, 7,12, 7, 7, 7,12, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{13, 3, 7, 7, 7,28, 3, 7,28,13,14, 7, 7, 7, 7, 7, 7,33,34,13,14,33,34,28, 7, 7, 7, 7, 7,28,33,34, 7, 7, 7,33,34, 7, 7,}}, 
     {{ 7, 7,14, 3,28, 7, 7, 7, 7, 7, 7, 3,28,13,14,33,34, 7, 7, 7, 7, 7, 7, 7,13,14,33,34,28, 7, 7, 7,28,33,34, 7, 7,33,34,}}, 
     {{15, 7, 7, 7, 7, 4, 7, 7, 4, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{ 7, 7,16, 7, 4, 7, 7, 7, 7, 7, 7, 7, 4, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7,15,16, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{17, 7, 7, 7, 7, 5, 7, 7, 5, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{ 7, 7,18, 7, 5, 7, 7, 7, 7, 7, 7, 7, 5, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7,17,18, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{19, 6, 7, 7, 7,29, 6, 7,29,19,20, 7, 7, 7, 7, 7, 7,35,36,19,20,35,36,29, 7, 7, 7, 7, 7,29,35,36, 7, 7, 7,35,36, 7, 7,}}, 
     {{ 7, 7,20, 6,29, 7, 7, 7, 7, 7, 7, 6,29,19,20,35,36, 7, 7, 7, 7, 7, 7, 7,19,20,35,36,29, 7, 7, 7,29,35,36, 7, 7,35,36,}}, 
     {{21, 7, 7, 7, 7, 8, 7, 7, 8, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{ 7, 7,22, 7, 8, 7, 7, 7, 7, 7, 7, 7, 8, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7,21,22, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{30, 7,31, 7, 7, 7, 7, 7,23, 7, 7, 7,23, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{24,11, 7, 7, 7,32,11, 7,32,24,25, 7, 7, 7, 7, 7, 7,37,38,24,25,37,38,32, 7, 7, 7, 7, 7,32,37,38, 7, 7, 7,37,38, 7, 7,}}, 
     {{ 7, 7,25,11,32, 7, 7, 7, 7, 7, 7,11,32,24,25,37,38, 7, 7, 7, 7, 7, 7, 7,24,25,37,38,32, 7, 7, 7,32,37,38, 7, 7,37,38,}}, 
     {{26, 7, 7, 7, 7,12, 7, 7,12, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{ 7, 7,27, 7,12, 7, 7, 7, 7, 7, 7, 7,12, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7,26,27, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{33, 7,34, 7, 7, 7, 7, 7,28, 7, 7, 7,28, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{35, 7,36, 7, 7, 7, 7, 7,29, 7, 7, 7,29, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{30, 7, 7, 7, 7,23, 7, 7,23, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{ 7, 7,31, 7,23, 7, 7, 7, 7, 7, 7, 7,23, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7,30,31, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{37, 7,38, 7, 7, 7, 7, 7,32, 7, 7, 7,32, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{33, 7, 7, 7, 7,28, 7, 7,28, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{ 7, 7,34, 7,28, 7, 7, 7, 7, 7, 7, 7,28, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7,33,34, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{35, 7, 7, 7, 7,29, 7, 7,29, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{ 7, 7,36, 7,29, 7, 7, 7, 7, 7, 7, 7,29, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7,35,36, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{37, 7, 7, 7, 7,32, 7, 7,32, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
     {{ 7, 7,38, 7,32, 7, 7, 7, 7, 7, 7, 7,32, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7,37,38, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,}}, 
  }};
  const std::vector<element_t> idempotents{{0,2,6,7,8,9,11,12,14,16,17,19,21,25,27}};
  const std::vector<element_t> omega{{0,7,2,7,7,7,6,7,8,9,7,11,12,7,14,7,16,17,7,19,7,21,7,7,7,25,7,27,7,7,7,7,7,7,7,7,7,7,7}};
  element_t ex, es, exf, esf, exfy, tesf;
  for(auto e: idempotents) {
    for(size_t x = 0; x < N; ++x) {
      ex = T[e][x];
      for(size_t s = 0; s < N; ++s) {
        es = T[e][s];
        for(auto f: idempotents) {
          exf = T[ex][f];
          esf = T[es][f];
          for(size_t y = 0; y < N; ++y) {
            exfy = omega[T[exf][y]];
            for(size_t t = 0; t < N; ++t) {
              tesf = omega[T[t][esf]];
              if(T[T[exfy][exf]][tesf] != T[T[exfy][esf]][tesf])
                return 0;
            }}}}}}
  return 1;
}

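(Please don't ask what the code does in detail. Roughly speaking, it implements a decision procedure from algebraic formal language theory; the code verifies an identity of the monoid given by the multiplication table T. In particular, the code is not a contrived example but a real application. Of course, one can debate what "application" means in the context of formal language theory.)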

With the code above, the user CPU run times on my machine are as follows:
  • time pypy timing.py reports 0m9.329s
  • time python timing.py reports 2m18.389s
  • g++ -std=c++11 -O2 timing.cc && time ./a.out reports 0m1.064s

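Edits

  • For a fairer comparison, I applied by hand some optimizations that g++ apparently performs automatically. I reordered the loops and moved variable assignments as far outward as possible (as suggested in the comments). This gives a speed-up factor of about 2.5 for pypy and about 2 for python.
  • Also in the interest of fairness, I used the dynamically sized std::vector instead of the fixed-size std::array.
  • I removed the asides about why numpy is slower (the comments indicate that I was not using it correctly) and about why running the loops in the main part of the script is slower (it is well known that Python is faster with local variables than with global ones).
  • I am aware that C++ and Python have different scopes. I also know that C++ is compiled whereas Python is usually interpreted (which is why I used pypy). I want to know the ultimate technical reason why pypy is so slow on this particular code. (With numpy and numba one might get close to native performance, but that is not in the spirit of my question, since it effectively moves all the computation back into C code.) I have clarified my question accordingly.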
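
The hand optimization from the first bullet (moving look-ups that do not depend on the inner loop variables outward) might look roughly like the sketch below. It is only an illustration under assumptions: the names make_example, check_naive and check_hoisted are made up, the tiny table of the integers modulo 6 under multiplication stands in for the 39-element table, and the exact reordering used for the timings is not shown in the question.

```python
def make_example():
    """Hypothetical stand-in for the 39-element table: (Z_6, *)."""
    n = 6
    T = [[(a * b) % n for b in range(n)] for a in range(n)]
    idempotents = [a for a in range(n) if T[a][a] == a]
    omega = []
    for a in range(n):
        p = a
        while T[p][p] != p:  # multiply by a until the power is idempotent
            p = T[p][a]
        omega.append(p)
    return T, idempotents, omega


def check_naive(T, idempotents, omega):
    """Everything is recomputed in the innermost loop."""
    n = len(T)
    for e in idempotents:
        for x in range(n):
            for s in range(n):
                for f in idempotents:
                    for y in range(n):
                        for t in range(n):
                            exf = T[T[e][x]][f]
                            esf = T[T[e][s]][f]
                            exfy = omega[T[exf][y]]
                            tesf = omega[T[t][esf]]
                            if T[T[exfy][exf]][tesf] != T[T[exfy][esf]][tesf]:
                                return False
    return True


def check_hoisted(T, idempotents, omega):
    """Same test, with loop-invariant look-ups moved outward."""
    rng = range(len(T))
    for e in idempotents:
        Te = T[e]                      # row of T, fixed for this e
        for x in rng:
            Tex = T[Te[x]]
            for s in rng:
                Tes = T[Te[s]]
                for f in idempotents:
                    exf = Tex[f]
                    esf = Tes[f]
                    Texf = T[exf]
                    for y in rng:
                        exfy = omega[Texf[y]]
                        left = T[T[exfy][exf]]    # rows compared below
                        right = T[T[exfy][esf]]
                        for t in rng:
                            tesf = omega[T[t][esf]]
                            if left[tesf] != right[tesf]:
                                return False
    return True


T, idempotents, omega = make_example()
print(check_naive(T, idempotents, omega) == check_hoisted(T, idempotents, omega))
```

Both functions perform the same comparisons; the hoisted one simply does fewer list look-ups per innermost iteration, which is the kind of work an optimizing C++ compiler can often do on its own.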

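Short answer for numpy and Python loops

If you use numpy correctly, you push everything back down to the C level again.

Long answer

How can that be?

I will show here a bit of what I meant in my comment, and how to avoid unnecessary loops with numpy. Let's look at the original code, assuming you have already converted all the lists to numpy arrays via np.asarray(list).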

for e in idempotents:
    for x in M:
        ex = T[e][x]

This translates directly to:

T[idempotents]

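Why?

Numpy arrays can be indexed with index arrays. For example, T[0] returns row 0 of the matrix T with all of its columns (in fact, all following dimensions), so T[0] == T[0,:] for 2D arrays. Since the loops iterate over all idempotents as the first index and then over all elements T[e][x] of that row, T[idempotents] is equivalent to these two loops.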
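
As a small illustration of this index-array behaviour, the sketch below compares the explicit double loop with a single indexing operation. The 3x3 table and the index list rows are made up for the example; rows merely plays the role of idempotents.

```python
import numpy as np

# Hypothetical 3x3 table (addition modulo 3), standing in for the 39x39 T.
T = np.asarray([[0, 1, 2],
                [1, 2, 0],
                [2, 0, 1]])
rows = np.asarray([0, 2])  # arbitrary row indices, playing the role of idempotents

# Explicit double loop, as in the original code.
looped = []
for e in rows:
    row = []
    for x in range(T.shape[1]):
        row.append(T[e][x])
    looped.append(row)

# One indexing operation: selects rows 0 and 2 in a single step.
indexed = T[rows]

print(indexed.shape)
print(np.array_equal(looped, indexed))
print(np.array_equal(T[0], T[0, :]))
```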

Details can be found here.

ex, es

Next up is:

for e in idempotents:
    for x in M:
        ex = T[e][x]
        for s in M:
            es = T[e][s]

Since there is no need to run the whole loop again, this translates to:

es=ex

Because we are using Python, the matrix es is not even copied, only referenced.

exf, esf

I am now skipping some of the for loops in the code snippet.

for f in idempotents:
    exf = T[ex][f]
    esf = T[es][f]

Now you are again accessing the outermost index with the idempotents vector, so we can do the same thing with numpy in the same way:

T[T[idempotents]]
print(T[T[idempotents]].shape)
# (15, 39, 39)

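Now we have an array of dimension (15, 39, 39), because for each element of the 2D array T[idempotents] you get back a row of T, which is essentially the third loop.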

exf = T[T[idempotents]]
esf = exf
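
To make the shape argument concrete, here is a minimal sketch on a made-up 3x3 table (the table and the index list rows are assumptions for illustration, with rows again standing in for idempotents). It shows that nesting the index-array look-up adds one axis, and that selecting columns afterwards yields the T[T[e][x]][f] values.

```python
import numpy as np

# Hypothetical 3x3 table (addition modulo 3), not the table from the question.
T = np.asarray([[0, 1, 2],
                [1, 2, 0],
                [2, 0, 1]])
rows = np.asarray([0, 2])  # stands in for idempotents

nested = T[T[rows]]        # one full row of T for every entry of T[rows]
print(nested.shape)        # (len(rows), 3, 3)

# Spot-check the meaning of each entry: nested[i, x, j] == T[T[rows[i]][x]][j]
ok = all(
    nested[i, x, j] == T[T[rows[i]][x]][j]
    for i in range(len(rows))
    for x in range(3)
    for j in range(3)
)
print(ok)

# Selecting the columns listed in rows gives exf[i, x, f] = T[T[e][x]][f].
exf = nested[:, :, rows]
print(exf.shape)
```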

From here on it gets more complicated, and I will skip the rest. It would be lines like the following:

Ti = T[idempotents]  # T[e][x] == T[e][s] by loop definition
TTi = T[Ti]  # T[T[e][x]]
TTi.shape = -1 , 39  # bring first index back into shape
exf = TTi[:, idempotents]  # T[T[e][x]][f]
esf = exf  # T[T[e][s]][f] == T[T[e][x]][f] by loop definition
Texf = T[exf].ravel()
exfy = omega[Texf]
TTexf = T.T[exf].ravel()  # tesf = omega[T[t][esf]] # since I cannot index fast along t I use the transpose of T
tesf = omega[TTexf]
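
One possible way to complete the skipped steps is with broadcasting, sketched below. This is not the answer's exact code: the function names check_loops and check_vectorized are hypothetical, the small example tables are made up, and the sketch keeps x and s on separate axes (the loops run over all pairs) while handling one e at a time to limit memory. For the 39-element table each comparison array would hold 39·39·15·39·39, roughly 35 million, entries per e.

```python
import numpy as np


def check_loops(T, idempotents, omega):
    """Reference: the six nested loops from the question."""
    n = len(T)
    for e in idempotents:
        for x in range(n):
            ex = T[e][x]
            for s in range(n):
                es = T[e][s]
                for f in idempotents:
                    exf = T[ex][f]
                    esf = T[es][f]
                    for y in range(n):
                        exfy = omega[T[exf][y]]
                        for t in range(n):
                            tesf = omega[T[t][esf]]
                            if T[T[exfy][exf]][tesf] != T[T[exfy][esf]][tesf]:
                                return False
    return True


def check_vectorized(T, idempotents, omega):
    """Same test; only the loop over e remains in Python."""
    T = np.asarray(T)
    idem = np.asarray(idempotents)
    omega = np.asarray(omega)
    for e in idem:
        exf = T[T[e]][:, idem]      # exf[x, f] = T[T[e][x]][f]; esf is the same table indexed by s
        exfy = omega[T[exf]]        # exfy[x, f, y] = omega[T[exf][y]]
        tesf = omega[T[:, exf]]     # tesf[t, s, f] = omega[T[t][esf]]
        # Arrange everything on the common axes (x, s, f, y, t).
        a = exfy[:, None, :, :, None]
        xf = exf[:, None, :, None, None]
        sf = exf[None, :, :, None, None]
        ts = tesf.transpose(1, 2, 0)[None, :, :, None, :]
        if not np.array_equal(T[T[a, xf], ts], T[T[a, sf], ts]):
            return False
    return True


# Hypothetical small example: integers modulo 6 under multiplication.
n = 6
T6 = [[(a * b) % n for b in range(n)] for a in range(n)]
idem6 = [0, 1, 3, 4]          # elements with a*a == a (mod 6)
omega6 = [0, 1, 4, 3, 4, 1]   # idempotent power of each element
print(check_loops(T6, idem6, omega6) == check_vectorized(T6, idem6, omega6))
```

The two functions agree on these small inputs; whether the vectorized form is actually faster on the full table depends on memory bandwidth and was not measured here.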

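Because Python interprets its input at runtime, whereas a C++ compiler is able to perform major optimizations, in particular for for loops over fixed-size arrays.

Some compilers will even compute the entire result of nested for loops at compile time.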
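
To get a rough feel for the interpretation overhead mentioned above, the standard dis module can list the bytecode that CPython executes for one pass through the innermost loop body. This is a sketch only: the helper inner_step is made up for illustration, the exact instructions differ between Python versions, and pypy compiles hot loops with its JIT, so the listing says nothing precise about pypy's generated code.

```python
import dis


def inner_step(T, omega, exfy, exf, esf, t):
    """Body of the innermost loop from the question, for one value of t."""
    tesf = omega[T[t][esf]]
    return T[T[exfy][exf]][tesf] != T[T[exfy][esf]][tesf]


instructions = list(dis.get_instructions(inner_step))
print(len(instructions), "bytecode instructions for one inner iteration")
for ins in instructions:
    print(ins.opname, ins.argrepr)
```

Every listed instruction is dispatched by the interpreter loop and operates on boxed Python objects, whereas the C++ version can compile the same expression down to a few indexed loads and a compare.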