Direct execution of the data

Keywords: data execution. Last updated: 2023-10-16

What is the best way to make a program execute data? For example, I have written a (so-called) compiler for x86_64 machines:

#include <iostream>
#include <vector>
#include <cstdlib>
#include <cstdint>
#include <utility> // std::forward
struct compiler
{
    void op() const { return; }
    template< typename ...ARGS >
    void op(std::uint8_t const _opcode, ARGS && ..._tail)
    {
        code_.push_back(_opcode);
        return op(std::forward< ARGS >(_tail)...);
    }
    void clear() { code_.clear(); }
    long double operator () () const
    {
        // ?
    }
private :
    std::vector< std::uint8_t > code_;
};
int main()
{
    compiler compiler_; // long double (*)();
    compiler_.op(0xD9, 0xEE); // FLDZ
    compiler_.op(0xC3);       // ret
    std::cout << compiler_() << std::endl;
    return EXIT_SUCCESS;
}

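But I don't know how to implement operator () properly. I suspect I have to put the entire contents of code_ into a contiguous block of memory, cast it to long double (*)() and call it. There are some difficulties, though: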

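Should I use VirtualProtect(Ex) (+ FlushInstructionCache) on Windows? Is there something similar on Linux?
Which container reliably places the bytes in memory in the right way (i.e. one after another) and also allows getting a pointer to the memory block?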

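First of all, you need to allocate the memory for the code as executable [on Windows, use VirtualAlloc with an executable protection flag such as PAGE_EXECUTE_READWRITE; on Linux, use mmap with PROT_EXEC among the protection flags]. It is probably a lot easier to allocate one large region of this kind of memory and then write an "allocation function" that hands out pieces of it for your content. You could also use VirtualProtect, or mprotect, the corresponding function on Linux, but I'd say allocating the memory as executable in the first place is the better choice. I don't believe you need to flush the instruction cache if the memory is already allocated as executable, certainly not on x86, and since your instructions are x86 instructions, I guess that's a fair limitation.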
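As a rough illustration of "allocate it as executable in the first place", here is a minimal sketch for Linux/POSIX using mmap with PROT_EXEC. The helper names alloc_executable and free_executable are hypothetical and were chosen only for this example. It also assumes the system permits mappings that are writable and executable at the same time, which some hardened configurations refuse.

```cpp
#include <sys/mman.h> // mmap, munmap, PROT_*, MAP_*
#include <cstddef>    // std::size_t

// Hypothetical helper: returns an anonymous mapping of `size` bytes that is
// readable, writable and executable, or nullptr if the mapping fails.
void* alloc_executable(std::size_t size)
{
    void* p = mmap(nullptr, size,
                   PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical helper: releases a mapping obtained from alloc_executable.
// `size` must be the same value that was passed when allocating.
void free_executable(void* p, std::size_t size)
{
    if (p != nullptr)
        munmap(p, size);
}
```

The kernel rounds the mapping up to whole pages, so requesting one large region and carving it up with your own allocation function avoids wasting a page per small code fragment.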

Second, you need to make something like a function pointer to your code. Something like this:

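typedef long double (*funcptr)(); // matches the return type used in the question
// code_ must live in executable memory (see the first point) for the call to be valid
funcptr f = reinterpret_cast<funcptr>(&code_[0]);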

That should do the trick.
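Putting the two points together, the body of operator () could look roughly like the sketch below. It is written as a free function, run_code, which is a hypothetical name and not part of the original class. It assumes Linux on x86-64 with the System V ABI, where a long double result is returned in the x87 register st(0). Instead of mapping the memory as executable from the start, it uses the mprotect alternative: the bytes are copied into a writable mapping that is then switched to read+execute.

```cpp
#include <sys/mman.h> // mmap, mprotect, munmap
#include <cstddef>
#include <cstdint>
#include <cstring>    // std::memcpy
#include <stdexcept>
#include <vector>

// Hypothetical helper: copies machine code into fresh memory, makes that
// memory executable, calls it as `long double (*)()` and returns the result.
// Assumption: the bytes form a complete x86-64 function that ends in `ret`.
long double run_code(std::vector<std::uint8_t> const& code)
{
    if (code.empty())
        throw std::invalid_argument("no code to run");

    std::size_t const size = code.size();
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        throw std::runtime_error("mmap failed");

    // std::vector stores its elements contiguously, so data() can be
    // copied as one block.
    std::memcpy(mem, code.data(), size);

    // Drop write permission and add execute permission before calling.
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        throw std::runtime_error("mprotect failed");
    }

    using funcptr = long double (*)();
    funcptr f = reinterpret_cast<funcptr>(mem);
    long double const result = f();

    munmap(mem, size);
    return result;
}
```

With the bytes from the question, run_code({0xD9, 0xEE, 0xC3}) executes FLDZ followed by ret and yields zero. Mapping and unmapping on every call keeps the sketch short; a real implementation would more likely keep the executable region alive and reuse it.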