Receiving only necessary data with C++ Socket

Keywords: data, C++, Socket      Updated: 2023-10-16

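I am just trying to get the contents of a page together with its headers, but my buffer of size 1024 seems to be either too large or too small for the last packet of information that comes through. I don't want to get too much or too little, if that makes sense. Here is my code. It prints the page with all of its information just fine, but I want to make sure it is correct.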

//Build HTTP Get Request
std::stringstream ss;
ss << "GET " << url << " HTTP/1.0\r\nHost: " << strHostName << "\r\n\r\n";
std::string req = ss.str();
// Send Request
send(hSocket, req.c_str(), strlen(req.c_str()), 0);
// Read from socket into buffer.
do
{
     nReadAmount = read(hSocket, pBuffer, sizeof pBuffer);
     printf("%s", pBuffer);
}
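while(nReadAmount != 0);

 nReadAmount = read(hSocket, pBuffer, sizeof pBuffer);
 printf("%s", pBuffer);

This is bad. The %s format specifier works only with C-style (zero-terminated) strings. How is printf supposed to know how many bytes to print? That information is in nReadAmount, but you don't use it.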

Also, you call printf even if read fails.

The simplest fix:

 do
 {
     nReadAmount = read(hSocket, pBuffer, (sizeof pBuffer) - 1);
     if (nReadAmount <= 0)
         break;
     pBuffer[nReadAmount] = 0;
     printf("%s", pBuffer);
 } while(1);
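
As an alternative to zero-terminating the buffer, the byte count returned by read can be passed straight to the output call. The following is only a minimal sketch under some assumptions: `copyAll` is a hypothetical helper name (not part of the original code), the socket is a blocking POSIX descriptor, and interrupted calls (EINTR) are not retried.

```cpp
#include <cstdio>
#include <unistd.h>

// Hypothetical helper: copies everything readable from fd to out, using the
// byte count that read() returns instead of relying on a terminating zero.
// Returns the total number of bytes copied, or -1 if read() failed.
long copyAll(int fd, FILE* out)
{
    char buffer[1024];
    long total = 0;
    for (;;)
    {
        ssize_t n = read(fd, buffer, sizeof buffer);
        if (n < 0)
            return -1;   // read error: do not print anything from the buffer
        if (n == 0)
            break;       // the peer closed the connection
        // Write exactly n bytes; embedded zero bytes are preserved.
        fwrite(buffer, 1, static_cast<size_t>(n), out);
        total += n;
    }
    return total;
}
```

With this shape, the call site would be something like `copyAll(hSocket, stdout);`. Unlike the `%s` version, it also works when the response body contains zero bytes (for example, an image).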

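The correct way to read an HTTP reply is to read until you have received a complete LF-delimited line (some servers use a bare LF even though the official specification says to use CRLF). That first line contains the response code and version. Then keep reading LF-delimited lines, which are the headers, until you encounter a zero-length line, which marks the end of the headers. You then have to analyze the headers to figure out how the rest of the data is encoded, so that you know the proper way to read it and how it is terminated. There are several different possibilities; see RFC 2616 Section 4.4 for the actual rules.

In other words, your code needs to use this kind of structure instead (pseudocode):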
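
The line-reading step could look roughly like the sketch below. The names `readLine` and `parseStatusCode` are hypothetical, and the code assumes a blocking POSIX descriptor. It reads one byte per call, which is slow but never consumes data past the end of the line; a real client would more likely buffer its reads.

```cpp
#include <cstdlib>
#include <string>
#include <unistd.h>

// Hypothetical helper: reads one LF-terminated line from fd, one byte at a
// time, and strips the LF plus an optional CR in front of it. Returns false
// if the connection closed or failed before a full line arrived.
bool readLine(int fd, std::string& line)
{
    line.clear();
    char c;
    for (;;)
    {
        ssize_t n = read(fd, &c, 1);
        if (n <= 0)
            return false;
        if (c == '\n')
            break;
        line.push_back(c);
    }
    if (!line.empty() && line.back() == '\r')
        line.pop_back();   // tolerate both CRLF and bare LF
    return true;
}

// Hypothetical helper: extracts the numeric status from a status line such
// as "HTTP/1.0 200 OK". Returns -1 if the line does not look like one.
int parseStatusCode(const std::string& statusLine)
{
    std::string::size_type space = statusLine.find(' ');
    if (statusLine.compare(0, 5, "HTTP/") != 0 || space == std::string::npos)
        return -1;
    int code = std::atoi(statusLine.c_str() + space + 1);
    return (code >= 100 && code <= 999) ? code : -1;
}
```

An empty string returned with `true` is the zero-length line that ends the headers.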

// Send Request
send(hSocket, req.c_str(), req.length(), 0);
// Read Response
std::string line = ReadALineFromSocket(hSocket);
int rescode = ExtractResponseCode(line);
std::vector<std::string> headers;
do
{
     line = ReadALineFromSocket(hSocket);
     if (line.length() == 0) break;
     headers.push_back(line);
}
while (true);
if (
    ((rescode / 100) != 1) &&
    (rescode != 204) &&
    (rescode != 304) &&
    (request is not "HEAD")
)
{
    if ((headers has "Transfer-Encoding") && (Transfer-Encoding != "identity"))
    {
        // read chunks until a 0-length chunk is encountered.
        // refer to RFC 2616 Section 3.6 for the format of the chunks...
    }
    else if (headers has "Content-Length")
    {
       // read how many bytes the Content-Length header says...
    }
    else if ((headers has "Content-Type") && (Content-Type == "multipart/byteranges"))
    {
        // read until the terminating MIME boundary specified by Content-Type is encountered...
    }
    else
    {
        // read until the socket is disconnected...
    }
}
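
The decision tree in the pseudocode above can be written as a small pure function, which keeps it separate from the socket code. This is only a sketch under assumptions: `BodyFraming`, `headerValue` and `chooseFraming` are hypothetical names, headers are assumed to be stored as raw "Name: value" lines, and the Content-Type check matches by prefix (rather than exact equality) because that header normally carries a boundary parameter.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

enum class BodyFraming { None, Chunked, ContentLength, MultipartByteranges, UntilClose };

static std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Returns the trimmed, lower-cased value of the first header whose name
// equals `name` (given in lower case), or an empty string if it is absent.
std::string headerValue(const std::vector<std::string>& headers, const std::string& name)
{
    for (const std::string& h : headers)
    {
        std::string::size_type colon = h.find(':');
        if (colon == std::string::npos || toLower(h.substr(0, colon)) != name)
            continue;
        std::string value = toLower(h.substr(colon + 1));
        std::string::size_type first = value.find_first_not_of(" \t");
        if (first == std::string::npos)
            return std::string();
        std::string::size_type last = value.find_last_not_of(" \t");
        return value.substr(first, last - first + 1);
    }
    return std::string();
}

// Mirrors the order of checks in the pseudocode (RFC 2616 Section 4.4).
BodyFraming chooseFraming(int statusCode, bool isHeadRequest,
                          const std::vector<std::string>& headers)
{
    if (statusCode / 100 == 1 || statusCode == 204 || statusCode == 304 || isHeadRequest)
        return BodyFraming::None;            // these responses never have a body
    std::string te = headerValue(headers, "transfer-encoding");
    if (!te.empty() && te != "identity")
        return BodyFraming::Chunked;         // read chunks until a 0-length chunk
    if (!headerValue(headers, "content-length").empty())
        return BodyFraming::ContentLength;   // read exactly that many bytes
    if (headerValue(headers, "content-type").compare(0, 20, "multipart/byteranges") == 0)
        return BodyFraming::MultipartByteranges;
    return BodyFraming::UntilClose;          // read until the socket disconnects
}
```

The caller would then switch on the returned value and run the matching read loop. Note that Transfer-Encoding is checked before Content-Length, so a response carrying both is treated as chunked.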