Robert Eisele
Engineer, Systems Architect and DBA

Codingame: The Resistance

Original Problem

The Goal

You work in the National Resistance Museum and you just uncovered hundreds of documents which contain coded Morse transmissions. In the documents, none of the spaces have been transcribed to separate the letters and the words hidden behind the Morse sequence. Therefore, there may be several interpretations of any single decoded sequence.

Your program must be able to determine the number of different messages that it’s possible to obtain from one Morse sequence and a given dictionary.

Rules

Morse is a code composed of dots and dashes representing the letters of the alphabet. Here is the transcription of an alphabet in Morse:
A .-B -...C -.-.D -..
E .F ..-.G --.H ....
I ..J .---K -.-L .-..
M --N -.O ---P .--.
Q --.-R .-.S ...T -
U ..-V ...-W .--X -..-
Y -.--Z --..
Since none of the spaces have been transcribed, there may be several possible interpretations. For example, the sequence -....--.-.could be any of the following: BAC, BANN, DUC, DU TETE, ...

A human being can recognize where the segmentations should be made due to their knowledge of the language, but for a machine, it’s harder. In order for your program to do the same, you are given a dictionary containing all of the right words.

However, even with a dictionary, it’s possible that a sequence might correspond to several valid messages (BAC, DUC, DU and TETE might be present in the dictionary of the previous example).

Game Input

Input

Line 1: a Morse sequence with a maximum length L

Line 2: an integer N corresponding to the number of words in the dictionary

The N following lines: one word from the dictionary per line. Each word has a maximum length M and only appears once in the dictionary.

Output
An integer R corresponds to the number of messages that it is possible to generate with the Morse sequence and the dictionary.
Constraints

0 < L < 100000
0 < N < 100000
0 < M < 20
0 ≤ R < 2^63

Solution

The simple part of this problem is that we need a translation from a normal string to a morse string, which can be done with this little JavaScript snippet:

var Alphabet = [
  ".-", "-...", "-.-.", "-..", ".", "..-.", "--.", "....",
  "..", ".---", "-.-", ".-..", "--", "-.", "---", ".--.",
  "--.-", ".-.", "...", "-", "..-", "...-", ".--", "-..-",
  "-.--", "--.."
];

function morse(s) {
  var res = "";
  for (var i = 0; i < s.length; i++) {
    res+= Alphabet[s.charCodeAt(i) - 65];
  }
  return res;
}

We now read in the morse string we want to check against the dictionary and the actual dictionary, which gets translated to morse code instantly. When we're finished, we call the solve method:

var L = readline();
var N = +readline();
var words = new Array(N);
for (var i = 0; i < N; i++) {
  words[i] = morse(readline());
}
print(solve(0, L, {}));

The job of the solve() method is pretty straightforward. We have our morse string L and the dictionary. What we now need to do is starting at the beginning of L and try to find all strings from the dictionary that start with the code. If we've found a string, we skip that word and try to match recursively all strings from the dictionary beginning with that offset. With this approach a lot of duplicate checks need to be done, which makes the problem perfectly cachable. The final implementation then looks like this:

function solve(start, str, dp) {

  if (start === str.length)
    return 1;

  if (dp[start] !== undefined)
    return dp[start];

  var res = 0;
  for (var i = 0; i < words.length; i++) {
    if (str.substr(start, words[i].length) === words[i]) {
      res+= solve(start + words[i].length, str, dp);
    }
  }
  return dp[start] = res;
}

But we can do better. We can create a dictionary of morse codes at the beginning:

var words = {};
for (var i = 0; i < N; i++) {
  var W = morse(readline());
  if (words[W])
    words[W]++;
  else
    words[W] = 1;
}

And apply this little change to the solve method, which makes the execution time an order of magnitude faster. The main idea is to not loop linearly over the dictionary but use it as a lookup table for at most 80 (= M * 4) character substrings:

function solve(start, str, dp) {

  if (start === str.length)
    return 1;

  if (dp[start] !== undefined)
    return dp[start];

  var res = 0;
  for (var i = 1; i <= 80 && start + i <= str.length; i++) {
    var n = words[str.substr(start, i)];
    if (n !== undefined)
      res+= n * solve(start + i, str, dp);
  }
  return dp[start] = res;
}

This really minor change in the way the problem is tackled, reduces the number of iterations from 11,276,136 down to 95,246 for the largest test case!

Go to overview