Given a string \$S\,ドル find the number of "unordered anagrammatic pairs" of substrings.
Input Format
First line contains \$T\,ドル the number of testcases. Each testcase consists of string \$S\$ in one line.
Constraints
\1ドル \leq T \leq 10\$
\2ドル \leq length(S) \leq 100\$
String \$S\$ contains only the lowercase letters of the English alphabet.
Output Format
For each testcase, print the required answer in one line.
Sample Input#00
2
abba
abcd
Sample Output#00
4
0
Sample Input#01
5
ifailuhkqq
hucpoltgty
ovarjsnrbf
pvmupwjjjf
iwwhrlkpek
Sample Output#01
3
2
2
6
3
Explanation
Sample00
Let's say \$S[i,j]\$ denotes the substring \$S_i, S_i+1,\ldots ,S_j\$ .
testcase 1:
For \$S=abba\,ドル anagrammatic pairs are: \$\lbrace S[1,1],S[4,4]\rbrace ,\lbrace S[1,2],S[3,4]\rbrace ,\lbrace S[2,2],S[3,3]\rbrace, \lbrace S[1,3],S[2,4]\rbrace \$.
testcase 2:
No anagrammatic pairs.
My introduction of algorithm:
This algorithm was my most favorite string algorithm in 2016, I did study a lot of code submissions using C#. In January 2017, I read Sherlock and anagrams on this site, started to practice again and again, tried a few things on Hackerrank online judge. In terms of time complexity, the editorial note on Hackerrank gives some analysis, I am also curious to know if I miss something important there.
I learned to stay on this site and work hard on every practice to get the chance to be the best.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;
class Solution
{
public class HashedAnagramString
{
/*
* Make sure that two anagram strings will have some hashed code
*
* @frequencyTable - Dictionary<char, int>
* The frequency table has to be sorted first and then construct
* a string with each char in alphabetic numbers concatenated by
* its occurrences.
*
*/
public static int GetHashedAnagram(Dictionary<char, int> frequencyTable)
{
// build frequency table dynamically, how many time? O(n*n), n
// is the string's length
StringBuilder key = new StringBuilder();
foreach (var item in frequencyTable.OrderBy(s => s.Key))
{
key.Append(item.Key + item.Value.ToString());
}
return key.ToString().GetHashCode();
}
}
static void Main(String[] args)
{
ProcessInput();
//RunSampleTestcase();
}
public static void RunSampleTestcase()
{
var hashedAnagramsDictionary = ConstructHashedAnagramsDictionary("abba");
var pairs = CalculatePairs(hashedAnagramsDictionary);
Debug.Assert(pairs == 4);
}
public static void ProcessInput()
{
var queries = int.Parse(Console.ReadLine());
while (queries-- > 0)
{
var input = Console.ReadLine();
var hashedAnagramsDictionary = ConstructHashedAnagramsDictionary(input);
Console.WriteLine(CalculatePairs(hashedAnagramsDictionary));
}
}
/*
* What should be taken cared of in the design?
* Time complexity:
* 3 for loops
* first loop, loop on the substring's length
* second loop, loop on the substring's start position
* third loop, go over each char in the substring to build a
* frequency table first; and then
* update hashed anagram counting dictionary - a statistics, basically
* tell the fact like this:
* For example, test case string abba,
* substring ab -> hashed key a1b1, value is 2, because there are
* two substrings: "ab","ba" having hashed key: "a1b1"
* Here are all possible hashed keys:
* a1 - a, a,
* b1 - b, b
* a1b1 - ab, ba
* b2 - bb
* a1b2 - abb, bba
* a2b2 - abba
*
*/
public static Dictionary<int, int> ConstructHashedAnagramsDictionary(string input)
{
var hashedAnagramsDictionary = new Dictionary<int, int>();
var length = input.Length;
for (var substringLength = 1; substringLength < length; substringLength++)
{
for (var index = 0; index <= length - substringLength; index++)
{
var frequencyTable = new Dictionary<char, int>();
// build frequency table dynamically, how many time? O(n*n),
// n is the string's length
for (var start = index; start < index + substringLength; start++)
{
char c = input[start];
if (frequencyTable.ContainsKey(c))
{
frequencyTable[c]++;
}
else
{
frequencyTable.Add(c, 1);
}
}
var key = HashedAnagramString.GetHashedAnagram(frequencyTable);
if (hashedAnagramsDictionary.ContainsKey(key))
{
hashedAnagramsDictionary[key]++;
}
else
{
hashedAnagramsDictionary.Add(key, 1);
}
}
}
return hashedAnagramsDictionary;
}
/*
* The formula to calculate pairs
* For example, test case string abba,
* substring ab -> hashed key a1b1, value is 2, because there are
* two substrings: "ab","ba" having hashed key: "a1b1"
* Here are all possible hashed keys:
* a1 - a, a,
* b1 - b, b
* a1b1 - ab, ba
* b2 - bb
* a1b2 - abb, bba
* a2b2 - abba
* So, how many pairs?
* should be 4, from 4 hashed keys: a1, b1, a1b1 and a2b2
*/
public static int CalculatePairs(Dictionary<int, int> hashedAnagrams)
{
// get pairs
int anagrammaticPairs = 0;
foreach (var count in hashedAnagrams)
{
anagrammaticPairs += count.Value * (count.Value - 1) / 2;
}
return anagrammaticPairs;
}
}
1 Answer 1
The only thing that I would perhaps comment on is that it could be done in a way that's shorter and simpler to understand (due to usage of LINQ and a different way of summing up calc (not done using the div by 2 formula)) if not every bit of performance wants to be squeezed out of the solution (I didn't handle chars directly, but worked with strings for example):
static int SherlockAndAnagrams(string s)
{
// Sorted subs with their respective occurrence frequency
var subFreqs = new Dictionary<string, int>();
var count = 0;
for (var i = 0; i < s.Length; i++)
{
for (var j = i; j < s.Length; j++)
{
var sub = new string(s.Substring(i, j - i + 1).ToCharArray().OrderBy(p => p).ToArray());
if (!subFreqs.ContainsKey(sub))
{
subFreqs.Add(sub, 1);
}
else
{
count += subFreqs[sub];
subFreqs[sub]++;
}
}
}
return count;
}
-
\$\begingroup\$ Would this of been true 4 years ago when the question was asked, or is this using something in C# that is newer? \$\endgroup\$2021年04月10日 19:54:13 +00:00Commented Apr 10, 2021 at 19:54
-
\$\begingroup\$ @pacmaninbw No new c# syntax, just quite a bit shorter and easier to get a grasp on solution I would say. If it doesn't qualify as a good response I can remove it, please let me know. But OP's entry also isn't really a question, he just posted his solution to the problem and vaguely asked for any suggestions (at least that's how I see it). The problem on hackerrank didn't change in the meantime though. \$\endgroup\$tkit– tkit2021年04月10日 20:01:11 +00:00Commented Apr 10, 2021 at 20:01
-
\$\begingroup\$ Welcome to Code Review, to make you better acquainted with code review please read How do I ask a good question? and How do I write a good answer. The answer is acceptable because you made the observation that the code could be simpler. There are a lot of questions like this on code review. \$\endgroup\$2021年04月11日 12:23:33 +00:00Commented Apr 11, 2021 at 12:23
Explore related questions
See similar questions with these tags.
ConstructHashedAnagramsDictionary
from N^3 to N^2, can you figure out how? \$\endgroup\$