使用 C++ 检查两个字符串是否为字谜

Check whether two strings are anagrams using C++

本文关键字：是否字符串两个检查 C++ 使用更新时间：2023-10-16

我想出了下面的程序来检查两个字符串是否是字谜。它对小字符串工作正常，但对于较大的字符串(我尝试：听，入伍(它给了我一个"不！

帮助！

#include<iostream.h> 
#include<string.h>
#include<stdio.h>
int main()
{
    char str1[100], str2[100];
    gets(str1);
    gets(str2);
    int i,j;
    int n1=strlen(str1);
    int n2=strlen(str2);
    int c=0;
    if(n1!=n2)
    {
          cout<<"nThey are not anagrams ! ";
          return 0;
    }
    else 
    {
         for(i=0;i<n1;i++)
             for(j=0;j<n2;j++)
                 if(str1[i]==str2[j])
                     ++c;
    }
    if(c==n1)
        cout<<"yes ! anagram !! ";
    else 
        cout<<"no ! ";
    system("pause");
    return 0;
}

我很懒，所以我会使用标准库功能对两个字符串进行排序，然后比较它们：

#include <string>
#include <algorithm>
bool is_anagram(std::string s1, std::string s2)
{
  std::sort(s1.begin(), s1.end());
  std::sort(s2.begin(), s2.end());
  return s1 == s2;
}

一个小的优化可能是在排序之前检查字符串的大小是否相同。

但是，如果这个算法被证明是一个瓶颈，我会暂时摆脱一些懒惰，并将其与一个简单的计数解决方案进行比较：

比较字符串长度
实例化计数映射，std::unordered_map<char, unsigned int> m
循环遍历s1，递增每个char的计数。
遍历s2，递减每个char的计数，然后检查计数是否0

当被要求查找aa和aa是否是字谜时，算法也会失败。尝试在心理上或在调试器中跟踪算法的步骤以找到原因;这样你就会学到更多。

顺便一提。。查找字谜的常用方法是计算每个字母在字符串中出现的次数。每个字母的计数应相等。这种方法具有 O(n( 时间复杂度，而不是 O(n²(。

bool areAnagram(char *str1, char *str2)
{
    // Create two count arrays and initialize all values as 0
    int count1[NO_OF_CHARS] = {0};
    int count2[NO_OF_CHARS] = {0};
    int i;
    // For each character in input strings, increment count in
    // the corresponding count array
    for (i = 0; str1[i] && str2[i];  i++)
    {
        count1[str1[i]]++;
        count2[str2[i]]++;
    }
    // If both strings are of different length. Removing this condition
    // will make the program fail for strings like "aaca" and "aca"
    if (str1[i] || str2[i])
      return false;
    // Compare count arrays
    for (i = 0; i < NO_OF_CHARS; i++)
        if (count1[i] != count2[i])
            return false;
    return true;
}

我在下面看到 2 种主要方法：

排序然后比较
计算每个字母的出现次数

有趣的是，Suraj 的好解决方案得到了一分(在撰写本文时由我(，但排序得到了 22 分。解释是性能不在人们的脑海中 - 这对于短弦来说很好。

排序实现只有 3 行长，但计数实现对于长字符串来说比平方。它要快得多(O(N(与O(NlogN(相比(。使用 500 MB 长字符串获得以下结果。

排序 - 162.8 秒
计数 - 2.864 秒
多线程计数 - 3.321 秒

多线程尝试是一种幼稚的尝试，它试图通过在单独的线程中计数来将速度加倍，每个线程一个字符串。内存访问是瓶颈，这是多线程使事情变得更糟的示例。

我很高兴看到一些可以加快计数解决方案的想法(想想那些擅长内存延迟问题、缓存的人(。

#include<stdio.h>
#include<string.h>
int is_anagram(char* str1, char* str2){
    if(strlen(str1)==strspn(str1,str2) && strlen(str1)==strspn(str2,str1) &&
    strlen(str1)==strlen(str2))
    return 1;
    return 0;
}
int main(){
    char* str1 = "stream";
    char* str2 = "master";
    if(is_anagram(str1,str2))
    printf("%s and %s  are anagram to each other",str1,str2);
    else
    printf("%s and %s  are not anagram to each other",str1,str2);
    return 0;
}

#include<iostream>
#include<unordered_map>
using namespace std;
int checkAnagram (string &str1, string &str2)
{
    unordered_map<char,int> count1, count2;
    unordered_map<char,int>::iterator it1, it2;
    int isAnagram = 0;
    if (str1.size() != str2.size()) {
        return -1;
    }
    for (unsigned int i = 0; i < str1.size(); i++) {
        if (count1.find(str1[i]) != count1.end()){
            count1[str1[i]]++;
        } else {
            count1.insert(pair<char,int>(str1[i], 1));
        }
    }
    for (unsigned int i = 0; i < str2.size(); i++) {
        if (count2.find(str2[i]) != count2.end()) {
            count2[str2[i]]++;
        } else {
            count2.insert(pair<char,int>(str2[i], 1));
        }
    }
    for (unordered_map<char, int>::iterator itUm1 = count1.begin(); itUm1 != count1.end(); itUm1++) {
        unordered_map<char, int>::iterator itUm2 = count2.find(itUm1->first);
        if (itUm2 != count2.end()) {
            if (itUm1->second != itUm2->second){
                isAnagram = -1;
                break;
            }
        }
    }
    return isAnagram;
}
int main(void)
{
    string str1("WillIamShakespeare");
    string str2("IamaWeakishSpeller");
    cout << "checkAnagram() for " << str1 << "," << str2 << " : " << checkAnagram(str1, str2) << endl;
    return 0;
}

有趣的是，有时最好的问题就是最简单的问题。

这里的问题是如何推断两个单词是否是字谜 - 一个单词本质上是一个未排序的多组字符。

我们知道我们必须排序，但理想情况下，我们希望避免排序的时间复杂度。

事实证明，在许多情况下，我们可以通过同时运行它们并将字符值异或到累加器中来消除许多线性时间上不同的单词。如果两个字符串都是字谜，则两个字符串中所有字符的总 XOR 必须为零，无论顺序如何。这是因为任何与自身相伴的东西都变成了零。

当然，反之则不然。仅仅因为累加器为零并不意味着我们有字谜匹配。

使用这些信息，我们可以消除许多没有排序的非字谜，至少短路非字谜的情况。

#include <iostream>
#include <string>
#include <algorithm>
//
// return a sorted copy of a string
//
std::string sorted(std::string in)
{
    std::sort(in.begin(), in.end());
    return in;
}
//
// check whether xor-ing the values in two ranges results in zero.
// @pre first2 addresses a range that is at least as big as (last1-first1)
//
bool xor_is_zero(std::string::const_iterator first1,
                 std::string::const_iterator last1,
                 std::string::const_iterator first2)
{
    char x = 0;
    while (first1 != last1) {
        x ^= *first1++;
        x ^= *first2++;
    }
    return x == 0;
}
//
// deduce whether two strings are the same length
//
bool same_size(const std::string& l, const std::string& r)
{
    return l.size() == r.size();
}
//
// deduce whether two words are anagrams of each other
// I have passed by const ref because we may not need a copy
//
bool is_anagram(const std::string& l, const std::string& r)
{
    return same_size(l, r)
    && xor_is_zero(l.begin(), l.end(), r.begin())
    && sorted(l) == sorted(r);
}
// test
int main()  {
    using namespace std;
    auto s1 = "apple"s;
    auto s2 = "eppla"s;
    cout << is_anagram(s1, s2) << 'n';
    s2 = "pppla"s;
    cout << is_anagram(s1, s2) << 'n';
    return 0;
}

预期：

1
0

试试这个：

// Anagram. Two words are said to be anagrams of each other if the letters from one word can be rearranged to form the other word. 
// From the above definition it is clear that two strings are anagrams if all characters in both strings occur same number of times. 
// For example "xyz" and "zxy" are anagram strings, here every character 'x', 'y' and 'z' occur only one time in both strings. 
#include <map>
#include <string>
#include <cctype>
#include <iostream> 
#include <algorithm>
#include <unordered_map>
using namespace std;
bool IsAnagram_1( string w1, string w2 )
{
    // Compare string lengths
    if ( w1.length() != w2.length() )
        return false;
    sort( w1.begin(), w1.end() );
    sort( w2.begin(), w2.end() );
    return w1 == w2;
}
map<char, size_t> key_word( const string & w )
{
    // Declare a map which is an associative container that will store a key value and a mapped value pairs
    // The key value is a letter in a word and the maped value is the number of times this letter appears in the word
   map<char, size_t> m;
   // Step over the characters of string w and use each character as a key value in the map
   for ( auto & c : w )
   {
        // Access the mapped value directly by its corresponding key using the bracket operator
        ++m[toupper( c )];  
   }
   return ( m );
}

bool IsAnagram_2( const string & w1, const string & w2 )  
{
    // Compare string lengths
    if ( w1.length() != w2.length() )
        return false;
    return ( key_word( w1 ) == key_word( w2 ) );
}

bool IsAnagram_3( const string & w1, const string & w2 )  
{
    // Compare string lengths
    if ( w1.length() != w2.length() )
        return false;
    // Instantiate a count map, std::unordered_map<char, unsigned int> m
    unordered_map<char, size_t> m;
    // Loop over the characters of string w1 incrementing the count for each character
   for ( auto & c : w1 )
   {
        // Access the mapped value directly by its corresponding key using the bracket operator
        ++m[toupper(c)];    
   }
    // Loop over the characters of string w2 decrementing the count for each character
   for ( auto & c : w2 )
   {
        // Access the mapped value directly by its corresponding key using the bracket operator
        --m[toupper(c)];    
   }
   // Check to see if the mapped values are all zeros
   for ( auto & c : w2 )
   {
        if ( m[toupper(c)] != 0 )
            return false;
   }
   return true;
}

int main( )
{
    string word1, word2;
    cout << "Enter first word: ";
    cin >> word1;
    cout << "Enter second word: ";
    cin >> word2;
    if ( IsAnagram_1( word1, word2 ) )
        cout << "nAnagram" << endl;
    else 
        cout << "nNot Anagram" << endl;

    if ( IsAnagram_2( word1, word2 ) )
        cout << "nAnagram" << endl;
    else 
        cout << "nNot Anagram" << endl;

    if ( IsAnagram_3( word1, word2 ) )
        cout << "nAnagram" << endl;
    else 
        cout << "nNot Anagram" << endl;
    system("pause");
    return 0;
}

在这种方法中，我还处理了空字符串和重复字符。享受它并评论任何限制。

#include <iostream>
#include <map>
#include <string>
using namespace std;
bool is_anagram( const string a, const string b ){
    std::map<char, int> m;
    int count = 0;
    for (int i = 0; i < a.length(); i++) {
        map<char, int>::iterator it = m.find(a[i]);
        if (it == m.end()) {
            m.insert(m.begin(), pair<char, int>(a[i], 1));
        } else {
            m[a[i]]++;
        }
    }
    for (int i = 0; i < b.length(); i++) {
        map<char, int>::iterator it = m.find(b[i]);
        if (it == m.end()) {
            m.insert(m.begin(), pair<char, int>(b[i], 1));
        } else {
            m[b[i]]--;
        }
    }
    if (a.length() <= b.length()) {
        for (int i = 0; i < a.length(); i++) {
            if (m[a[i]] >= 0) {
                count++;
            } else
                return false;
        }
        if (count == a.length() && a.length() > 0)
            return true;
        else
            return false;
    } else {
        for (int i = 0; i < b.length(); i++) {
            if (m[b[i]] >= 0) {
                count++;
            } else {
                return false;
            }
        }
        if (count == b.length() && b.length() > 0)
            return true;
        else
            return false;
    }
    return true;
}

检查两个字符串对于每个唯一字符是否具有相同的计数。

bool is_Anagram_String(char* str1,char* str2){
    int first_len=(int)strlen(str1);
    int sec_len=(int)strlen(str2);
    if (first_len!=sec_len) 
         return false;
    int letters[256] = {0};
    int num_unique_chars = 0;
    int num_completed_t = 0;
    for(int i=0;i<first_len;++i){
        int char_letter=(int)str1[i];
        if(letters[char_letter]==0)
            ++num_unique_chars;
        ++letters[char_letter];
    }
    for (int i = 0; i < sec_len; ++i) {
        int c = (int) str2[i];
        if (letters[c] == 0) { // Found more of char c in t than in s.
            return false;
        }
        --letters[c];
        if (letters[c] == 0) {
            ++num_completed_t;
            if (num_completed_t == num_unique_chars) {
                // it’s a match if t has been processed completely
                return i == sec_len - 1;
            }
        }
   }
    return false;}

#include <iostream>
#include <string.h>
using namespace std;
const int MAX = 100;
char cadA[MAX];
char cadB[MAX];
bool chrLocate;

int i,m,n,j, contaChr;
void buscaChr(char [], char []);
int main() {
    cout << "Ingresa CadA: ";
    cin.getline(cadA, sizeof(cadA));
    cout << "Ingresa CadB: ";
    cin.getline(cadB, sizeof(cadA));    
    if ( strlen(cadA) == strlen(cadB) ) {
        buscaChr(cadA,cadB);        
    } else {
        cout << "No son Anagramas..." << endl;
    }
    return 0;
}

void buscaChr(char a[], char b[]) {
    j = 0;  
    contaChr = 0;
    for ( i = 0; ( (i < strlen(a)) && contaChr < 2 ); i++ ) {
        for ( m = 0; m < strlen(b); m++ ) {
            if ( a[i] == b[m]) {
                j++;
                contaChr++;
                a[i] = '-';
                b[m] = '+';
            } else { contaChr = 0;  }
        }       
    }
    if ( j ==  strlen(a)) {
        cout << "SI son Anagramas..." << endl;
    } else {
        cout << "No son Anagramas..." << endl;
    }
}

您的算法不正确。您正在检查第一个单词中的每个字符，以查看该字符在第二个单词中出现的次数。如果这两个词是"aaaa"和"aaaa"，那么你会得到16个。对代码进行小的更改将允许它工作，但给出 N^2 的复杂性，因为您有一个双循环。

for(i=0;i<n1;i++)
for(j=0;j<n2;j++)
if(str1[i]==str2[j])
++c, str2[j] = 0; // 'cross off' letters as they are found.

我用字谜比较做了一些测试。比较两个字符串，每个字符串 72 个字符(字符串始终是真正的字谜以获得最大数量的比较(，使用几个不同的 STL 容器执行 256 个相同的测试......

template<typename STORAGE>
bool isAnagram(const string& s1, const string& s2, STORAGE& asciiCount)
{
    for(auto& v : s1)
    {
        asciiCount[v]++;    
    }
    for(auto& v : s2)
    {
        if(--asciiCount[static_cast<unsigned char>(v)] == -1)
        {
            return false;
        }
    }
    return true;
}

其中存储 asciiCount =

map<char, int> storage;           // 738us 
unordered_map<char, int> storage; // 260us 
vector<int> storage(256);         // 43us

//g++ -std=c++17 -O3 -Wall -pedantic

这是我能得到的最快的。

这些是使用 coliru 在线编译器 + 和 std：：chrono：：steady_clock：：time_point 进行测量的粗略测试，但它们给出了性能提升的一般概念。
vector 具有相同的性能，仅使用 256 个字节，尽管字符串的长度限制为 255 个字符(也更改为：--asciiCount[static_cast(v(] == 255 用于无符号字符计数(。
假设向量是最快的。一个改进是只分配一个 C 样式数组无符号字符 asciiCount[256];在堆栈上(因为 STL 容器在堆上动态分配其内存(
您可能会将此存储减少到 128 字节、64 甚至 32 字节(ascii 字符通常在 0..127 范围内，而 A-Z+a-z 64.127，只有大写或小写 64..95 或 96...127(，尽管不确定将其安装在缓存行或一半内会有什么好处。

有什么更好的方法可以做到这一点吗？为了速度、内存、代码优雅？

1.删除匹配字符的简单快捷方法

bool checkAnagram(string s1, string s2) {
    for (char i : s1) {
        unsigned int pos = s2.find(i,0);
        if (pos != string::npos) {
            s2.erase(pos,1);
        } else {
            return false;
        }
    }
    return s2.empty();
}

2. 转换为质数。漂亮但非常昂贵，需要特殊的大整数类型用于长字符串。

// https://en.wikipedia.org/wiki/List_of_prime_numbers
int primes[255] = {2, 3, 5, 7, 11, 13, 17, 19, ... , 1613}; 
bool checkAnagramPrimes(string s1, string s2) {
    long c1 = 1;
    for (char i : s1) {
        c1 = c1 * primes[i];
    }
    long c2 = 1;
    for (char i : s2) {
        c2 = c2 * primes[i];
        if (c2 > c1) {
            return false;
        }
    }
    return c1 == c2;
}

    string key="listen";
    string key1="silent";
    string temp=key1;  
    int len=0;
    //assuming both strings are of equal length
    for (int i=0;i<key.length();i++){
            for (int j=0;j<key.length();j++){
                if(key[i]==temp[j]){
                    len++;
                    temp[j] = ' ';//to deal with the duplicates
                    break;
                }
            }
        }
cout << (len==key.length());    //if true: means the words are anagrams

而不是使用在现代 c++ 中不推荐使用的点 h 标头。

试试这个解决方案。

        #include <iostream>
        #include <string>
        #include <map>
        int main(){
          std::string word_1 {};
          std::cout << "Enter first word: ";
          std::cin >> word_1;
          std::string word_2 {};
          std::cout << "Enter second word: ";
          std::cin >> word_2;
          if(word_1.length() == word_2.length()){
            std::map<char, int> word_1_map{};
            std::map<char, int> word_2_map{};
            for(auto& c: word_1)
              word_1_map[std::tolower(c)]++;
            for(auto& c: word_2)
              word_2_map[std::tolower(c)]++;
            if(word_1_map == word_2_map){
              std::cout << "Anagrams" << std::endl;
            }
            else{
              std::cout << "Not Anagrams" << std::endl;
            }
          }else{
            std::cout << "Length Mismatch" << std::endl;
          }
        }

#include <bits/stdc++.h>
using namespace std;
#define NO_OF_CHARS 256
int main()
{   bool ans = true;
    string word1 = "rest";
    string word2 = "tesr";
    unordered_map<char,int>maps;
    for(int i = 0 ; i <5 ; i++)
    {
        maps[word1[i]] +=1;
    }
    for(int i = 0 ; i <5 ; i++)
    {
        maps[word2[i]]-=1 ;
    }
    for(auto i : maps)
    {
        if(i.second!=0)
        {
            ans = false;
        }
    }
    cout<<ans;
}

Well if you don't want to sort than this code will give you perfect output. 
#include <iostream>
using namespace std;
int main(){
string a="gf da";
string b="da gf";
int al,bl;
int counter =0;
al =a.length();
bl =b.length();
for(int i=0 ;i<al;i++){
    for(int j=0;j<bl;j++){
        if(a[i]==b[j]){
            if(j!=bl){
                b[j]=b[b.length()-counter-1];
                bl--;
                counter++;
                break;
            }else{
                bl--;
                counter++;
            }
                
        }
    }
}
if(counter==al){
    cout<<"true";
}
else{
    cout<<"false";
}
return 0;
}

这是检查字谜的最简单，最快的方法

bool anagram(string a, string b) {
    int a_sum = 0, b_sum = 0, i = 0;
    while (a[i] != '') {
        a_sum += (int)a[i]; // (int) cast not necessary
        b_sum += (int)b[i];
        i++;
    }
    return a_sum == b_sum;
}

只需将 ASCII 值相加并检查总和是否相等。

例如：字符串 A = "NAP"，字符串 B = "平移">
a_sum = 110 + 97 + 112 = 319
b_sum = 112 + 97 + 110 = 319