Why is memset causing a problem despite being used on built-in types?

Keywords: why, problem, memset, built-in types. Updated: 2023-10-16

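I'm very new to C++, and while submitting a solution to this problem on Codeforces I suddenly found that using memset() causes a Wrong answer on one of the test cases.

Here is the test case: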

Input:
4 4
3 3 3 5
Participant's output
NO
Jury's answer
YES
1 2 3 4 
Checker comment
wrong answer Jury has the answer but the participant hasn't

Here is the code:

#include<iostream>
using namespace std;

int check_if_painted[5010][5010];
int input_array[5010];
int main(){
    int n,k;
    cin>>n>>k;
    int occurence_count[n];//Keeps track of the total no. of occurrences of an element in the input_array.
    memset(occurence_count,0,sizeof(occurence_count));
    /*
    The following loop checks that the number of occurrences of a particular
    element is not more than k. If the count exceeds k, then "NO" is printed and the program ends.
    */
    for (int i = 0; i < n; ++i)
    {
        cin>>input_array[i];
        ++occurence_count[input_array[i]];
        if(occurence_count[input_array[i]]>k){
            cout<<"NO";
            return 0;
        }
    }
    cout<<"YES\n";

    /*
    The following loop uses the array check_if_painted as a counter to check if the particular
    occurrence of an element has been painted with a colour from 1 to k or not.
    If some previous occurrence of this particular element has been painted with f%k+1,
    then f is incremented until we encounter any value (of `f%k+1`) from 1 to k that hasn't been
    used yet to colour and then we colour this element with that value by printing it.
    */
    int f=0;
    /*
    f is a running counter which increments to a very large value, but f%k+1 is used
    to restrict it to the range 1 to k (both inclusive). So, we are able to check
    if any previous occurrence of the current element has already been coloured with the value f%k+1 or not.
    */
    for(int i=0;i<n;++i){
        while(check_if_painted[input_array[i]][f%k+1]>0){
            ++f;
        }
        cout<<f%k+1<<" ";
        ++check_if_painted[input_array[i]][f%k+1];
        ++f;
    }
    return 0;
}

However, when I tried the code below, it was accepted.

#include<iostream>
using namespace std;

int check_if_painted[5010][5010];
int input_array[5010];
int occurence_count[5010];
int main(){
    int n,k;
    cin>>n>>k;

    for (int i = 0; i < n; ++i)
    {
        cin>>input_array[i];
        ++occurence_count[input_array[i]];
        if(occurence_count[input_array[i]]>k){
            cout<<"NO";
            return 0;
        }
    }
    cout<<"YES\n";

    int f=0;
    for(int i=0;i<n;++i){
        while(check_if_painted[input_array[i]][f%k+1]>0){
            ++f;
        }
        cout<<f%k+1<<" ";
        ++check_if_painted[input_array[i]][f%k+1];
        ++f;
    }
    return 0;
}

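From this post on SO, I learned that memset works well on built-in types. So why is it causing a problem in my case, where it is used on an array of int, which is a built-in type?

Also, I've read that std::fill() is the better alternative, and read in this post that memset is a dangerous function. Then why is it still in use?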

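This has nothing to do with memset. Your code is going outside the boundaries of your array, plain and simple.

In your input case you have n=4 and k=4, so occurence_count is 4 elements long (its valid indexes go from 0 to 3 inclusive). Then, you do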

cin>>input_array[i];
++occurence_count[input_array[i]];

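Given that the last value in the input is 5, you ultimately execute ++occurence_count[5], which is outside the boundaries of the array. This is undefined behavior. In your case it manifests as incrementing memory that doesn't belong to the array, which most probably does not start at 0 (the memset only zeroed the 4 elements that do belong to it), and that messes up the later check.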
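One way to make this kind of bug visible instead of silently touching foreign memory is checked element access. The following is only a minimal sketch, not part of the original code: count_checked is a hypothetical helper, and it relies on std::vector::at, which throws std::out_of_range for an invalid index.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from the original post): increments counts[value]
// with bounds checking. Returns false instead of invoking undefined behaviour
// when value is not a valid index.
bool count_checked(std::vector<int>& counts, int value) {
    try {
        // at() throws std::out_of_range when the index is >= counts.size().
        // A negative value converts to a huge unsigned index and is rejected too.
        ++counts.at(static_cast<std::size_t>(value));
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With a 4-element counter (as in the n=4 test case), the value 3 is counted normally, while the value 5 is reported as out of range rather than corrupting memory.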

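This problem doesn't occur in your second snippet because there you made occurence_count 5010 elements big, and it is zero-initialized by default since it is a global variable.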
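The difference between the two storage classes can be sketched as follows. The names global_counts, global_counts_all_zero and sum_of_fresh_local_array are made up for this illustration; the only facts relied on are that objects with static storage duration are zero-initialized, and that an empty initializer `= {}` zeroes a local array.

```cpp
#include <cstddef>

// Static storage duration: zero-initialized before main() runs,
// so no memset is needed for this array.
int global_counts[5010];

// Hypothetical helper: returns true if every element of global_counts is 0.
bool global_counts_all_zero() {
    for (std::size_t i = 0; i < 5010; ++i) {
        if (global_counts[i] != 0) return false;
    }
    return true;
}

// A local array needs an explicit initializer; `= {}` sets every element
// to 0. Without it, the contents would be indeterminate.
int sum_of_fresh_local_array() {
    int local_counts[16] = {};
    int sum = 0;
    for (int v : local_counts) sum += v;
    return sum;
}
```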

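Now, if you want to count the occurrences of array values, sizing the occurrences array to be as big as the number of elements is of course wrong: that is the count of numbers you are going to read (it would be fine for sizing input_array), not the maximum value you can read. Given that the array element values go from 1 to 5000, the occurrences array must be sized either 5001 (keeping the values as they are) or 5000 (decrementing the values you read by 1 to index the array).

(In general, be careful, as all the indexes in the problem text are 1-based, while indexes in C++ are 0-based. If you reason with problem indexes and then use them as C indexes, you risk off-by-one errors, unless you oversize your arrays by one and ignore the 0th element.)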
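As a rough illustration of sizing by the maximum value rather than by n, here is a minimal sketch. The function name each_value_at_most_k_times and the max_value parameter are hypothetical; the sketch assumes, as stated above, that every value lies in 1..max_value, and it uses the "oversize by one and ignore index 0" approach.

```cpp
#include <vector>

// Assumption taken from the answer: every value lies in 1..max_value.
// Hypothetical helper: returns true if no value occurs more than k times.
bool each_value_at_most_k_times(const std::vector<int>& values, int k,
                                int max_value = 5000) {
    // One slot per possible value plus an unused slot 0, so that a
    // 1-based value can be used directly as an index.
    std::vector<int> occurrence_count(max_value + 1, 0);
    for (int v : values) {
        if (++occurrence_count[v] > k) return false;
    }
    return true;
}
```

For the failing test case (values 3 3 3 5, k=4) this returns true, which corresponds to the jury's YES.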

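Finally, some notes:

If you compile with enough warnings enabled or with a recent enough compiler, it will rightly complain that memset is not declared or that it is implicitly declared (with an incorrect prototype, BTW); you should #include <string.h> (or <cstring> in C++) to use memset.
As @Nicol Bolas explains at length in his answer, you are using a VLA (variable-length array) when you declare a local array whose size is known only at runtime (int occurence_count[n]). VLAs aren't standard C++, so they aren't well specified, several compilers don't support them, and in general they are problematic in many ways (mostly, you shouldn't really allocate an unknown quantity of data on the stack, which is generally small).
You should probably avoid VLAs in favor of std::vector or, given that the problem gives you an upper limit for both colors and elements (5000), just static arrays.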
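A minimal sketch of the std::vector alternative, which also covers the std::fill question. The helper names make_counters and reset_counters are hypothetical and only illustrate the idea; they are not from the original code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: a zeroed counter array whose size is known only at
// run time, without a VLA and without memset.
std::vector<int> make_counters(int size) {
    // The (count, value) constructor stores the elements on the heap and
    // sets every one of them to 0.
    return std::vector<int>(static_cast<std::size_t>(size), 0);
}

// Resetting an existing buffer: std::fill assigns element by element, so it
// also works for values other than 0 (memset writes bytes, not ints).
void reset_counters(std::vector<int>& counters, int value) {
    std::fill(counters.begin(), counters.end(), value);
}
```

For this problem, make_counters(5001) would give one counter per possible value, with index 0 left unused.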