No matter, for which company you are giving the technical interview, this question can be asked. It seems to be so common that even in the Manager round, it can be asked.
Question : You are given a String and a number 'top' like 0, 1, 2...n
Now you need to find the 'top' most occurring characters in the given String.
Initial Thought : It is a simple task. Just get the count of each character & then pick the most
occurring ones. It is useless question not used in work in general.
Experienced Thought : What if more than 1 character are having the same number of
occurrences? Do we expect any other character not in ASCII?
Approach to solve such question can be used to get the information
about the most used nodes or systems in the infrastructure or the
most visiting users etc.
So in short, such seemingly simple questions are not so simple or useless most of the times.
Now, lets not complicate the question in hand uselessly.
Below I will be giving 4 ways to solve it where we are given a String & we need to find 'n' top most occurring characters in the given String.
Assumptions - 1) Only ASCII characters are present.
2) If there is a tie then give all such characters.
Still I will suggest to clarify above assumptions, though 99% interviewers will agree on the above, as s/he is also having the similar picture of the expected solution.
Note - For this case, I am not creating any 'Record' or 'Class' to hold this information, as per my understanding, creating Record or Class is going to degrade the code performance only. But in real situations like to find top used systems/nodes or top visitors, you will either be having class or may need to create the class for these, but approach will remain similar to the approaches given below-
First Approach : Using Java8 streams, as interviewers have fantacy for streams & FP, but I will suggest to use Streams if you are dealing with huge data & have required functions in place to use in your FP, else in general, Streams are going to degrade your code performance.
Question : You are given a String and a number 'top' like 0, 1, 2...n
Now you need to find the 'top' most occurring characters in the given String.
Initial Thought : It is a simple task. Just get the count of each character & then pick the most
occurring ones. It is useless question not used in work in general.
Experienced Thought : What if more than 1 character are having the same number of
occurrences? Do we expect any other character not in ASCII?
Approach to solve such question can be used to get the information
about the most used nodes or systems in the infrastructure or the
most visiting users etc.
So in short, such seemingly simple questions are not so simple or useless most of the times.
Now, lets not complicate the question in hand uselessly.
Below I will be giving 4 ways to solve it where we are given a String & we need to find 'n' top most occurring characters in the given String.
Assumptions - 1) Only ASCII characters are present.
2) If there is a tie then give all such characters.
Still I will suggest to clarify above assumptions, though 99% interviewers will agree on the above, as s/he is also having the similar picture of the expected solution.
Note - For this case, I am not creating any 'Record' or 'Class' to hold this information, as per my understanding, creating Record or Class is going to degrade the code performance only. But in real situations like to find top used systems/nodes or top visitors, you will either be having class or may need to create the class for these, but approach will remain similar to the approaches given below-
First Approach : Using Java8 streams, as interviewers have fantacy for streams & FP, but I will suggest to use Streams if you are dealing with huge data & have required functions in place to use in your FP, else in general, Streams are going to degrade your code performance.
Using Java Streams and casting each elememnt to char
Second Approach : In above approach we are type casting every item to char, so I thought to avoid that & do the type casting only for those elements which are part of the result. I thought that this will improve the performance but let see this later.
Using char stream and type casting in the result only
Third Approach : This is the one which will be coming first to solve the problem, as it is the easy one to visualize & seems to be easy one. But the challenge comes when you need to make the code efficient & manageable & easy to understand.
Lets see the below code-
Lets see the below code-
Using TreeMap to keep the elements sorted in reverse order
Fourth Approach : Lets try to improve above code, if it helps.
Note - Below code is specific to the current problem in hand with assumption that we have only ASCII characters else below code is not applicable-
Note - Below code is specific to the current problem in hand with assumption that we have only ASCII characters else below code is not applicable-
Using TreeSet
I tried to execute all the above approaches for the String of length 100_000_000 & fetched top 3 most occuring characters.
Below is the result about how much time each approach took to get the result-
Below is the result about how much time each approach took to get the result-