soxfox

We’re past the 25% mark of #48in24, and this week’s featured exercise is Scrabble Score. The idea is simple: given a word (any string counts, not just valid words), calculate the number of points it would be worth in Scrabble, with no point bonuses included.

More specifically, we need to:

  1. Iterate over the characters of the string.
  2. Find the score for each one, ignoring case.
  3. Sum all the letter scores.

Let’s get into it!

Languages

Python

Python was the very first language I used in this series. It’s a very popular language, and pretty easy to learn, but it contains some very powerful features as you learn a bit more about it. One such feature which I’ll use here is comprehensions, which provide the same functionality as typical map and filter functions with a built-in syntax.

The particular form of comprehension I use here is called a generator comprehension, which is a lazy comprehension that doesn’t compute its elements until they are required. Using a generator comprehension and sum, I can solve this exercise in just one line:

def score(word): return sum(SCORES[letter] for letter in word.upper())

There you go! It’s written almost in plain English – return the sum of the score of each letter in the uppercased word. …what’s SCORES? Don’t worry about that. It doesn’t matter.

SCORES = {
    'A': 1,
    'B': 3,
    'C': 3,
    'D': 2,
    # ...
    'W': 4,
    'X': 8,
    'Y': 4,
    'Z': 10,
}

…fine, I guess 97% of the solution is just this dictionary that defines the score of every letter. Obviously it had to exist in some form, so how about you just ignore it and marvel at the beauty of the one-liner again!

def score(word): return sum(SCORES[letter] for letter in word.upper())

Beautiful.

Scheme

It’s another Lisp! It’s been a while since the last featured Lisp in #48in24, but Scheme is here to fix that. Created at MIT, Scheme provides some excellent features that go beyond most Lisp dialects, such as hygienic macros and a very powerful numeric tower that supports arbitrary precision rationals.

Scheme is typically written in functional style, so my solution here will work similar to the Python one, but that doesn’t mean there’s nothing new to explore. First, I changed how the score lookup works, which makes it easier to convert from the table provided in the exercise instruction. It now lives in a function, and uses Scheme’s case expression:

(define (score-letter letter)
  (case (char-downcase letter)
        ((#\a #\e #\i #\o #\u #\l #\n #\r #\s #\t) 1)
        ((#\d #\g)                                 2)
        ((#\b #\c #\m #\p)                         3)
        ((#\f #\h #\v #\w #\y)                     4)
        ((#\k)                                     5)
        ((#\j #\x)                                 8)
        ((#\q #\z)                                 10)))

Neater, and easier to compare with the source information than the Python version. The score function is also pretty similar:

(define (score word)
  (apply + (map score-letter (string->list word))))

There’s a manual conversion from string to list, because strings can’t be used with map, and there’s an interesting trick for summing the numbers. At first, I used reduce to do this, but I realised that the + function in Scheme can already take as many arguments as you like. Using apply is like calling the first argument (+) with the second argument (result of the map), which should be a list, as its arguments. So if the map evaluates to (4 1 1 1 1), this becomes (+ 4 1 1 1 1), finding the sum of those 5 numbers.

C

The final featured language this week is C, which has appeared previously in both week 1 (as a bonus), and week 8.

C doesn’t provide the same sort of useful functional tools that I used in Python and Scheme, so this solution will need to have some explicit looping. It also doesn’t provide a dictionary like Python, and while it does have a switch-case structure that could be used like Scheme’s case, it’s a bit verbose. After the not-actually-one-liner in Python, I wanted to see how compact I could get the score structure.

Initially, I just shoved all the scores into an array. Then, I can just compute letter - 'a' to get an index into that array and get the score. But that’s not as compact as it could go, and not nearly crazy enough.

The maximum score of a letter is 10, which fits in 4 bits, so packing all the scores together results in a 104-bit string… and GCC supports a 128-bit integer type. I realised that I can treat a large number like an array by just bit-shifting and masking the result.

So, with that crazy idea in mind, let’s get started:

__extension__ typedef unsigned __int128 uint128;
uint128 letter_scores = 13034349334577425613786974331697;

To use GCC’s __int128 type on Exercism, which compiles with -Wpedantic enabled, you need to use the __extension__ keyword. I put this in a typedef for convenience, so that I can just use uint128. Just to make sure everything’s alright with this crazy integer, I’ll build it:

./scrabble_score.c:2:25: error: integer constant is too large for its type [-Werror]
    2 | uint128 letter_scores = 13034349334577425613786974331697;

Huh. Huh? As it turns out, you can’t just use a 128-bit literal in GCC, at least not on x86_64. At first I thought I’d have to give up on the idea, but it turns out this isn’t a limitation, this is just a chance to get even crazier.

uint64_t letter_scores_arr[] = {3536193777322500913, 706593493274};

Wait, that’s just a boring array, what happened to the big number? This is actually still the same number, arranged in a specific way. The first item in the array is the last 64 bits of the original number, and the second item is the first 64 bits. Because the tests run on a little endian architecture, this has exactly the same layout in memory as the 128-bit int would have, so if I can just find a way to treat this array as a uint128, it should work the same.

So that’s exactly what I did:

unsigned int score(const char *word) {
    uint128 letter_scores = *(uint128 *)letter_scores_arr;
    unsigned int total = 0;
    while (*word) {
        int index = tolower(*word++) - 'a';
        total += (letter_scores >> index * 4) & 15;
    }
    return total;
}

By treating the array address as a pointer to uint128, then dereferencing, I have the original number back! (This probably violates the strict aliasing rule, so don’t do it.) From there, it’s a simple matter of bitshifting it by 4 times a letter’s index, and masking off just the low 4 bits. Do this for each letter, and we get the full score!

How to Make Big Numbers

How did I actually get to those massive numbers in the C solution though? First, it’s best to understand the structure that they need to have. When looking up the score for the letter “a”, the number is not shifted at all, and the lowest 4 bits are used. For “b”, it’s shifted over by 4, then the new low 4 bits are used.

This means that the number must include the score of “a” in the lowest 4 bits, then “b” in the next 4, and so on. Additionally, for the array workaround, the lowest 64 bits must be written first, in order to end up with the right binary representation in memory.

To create the numbers, I used Python, reusing the SCORES dictionary from earlier. Python supports arbitrary precision integers, so I don’t need to worry about size, and the simplicity of Python makes it easy to put together quick temporary scripts like this.

number = 0
for score in reversed(SCORES.values()):
    number = number << 4 | score
print(number)
print(number & ((1 << 64) - 1), number >> 64)

I handled the bit order by shifting in each score, starting with Z and continuing all the way to A. By the time A’s score is added, Z will have been shifted 100 bits to the right, putting it in the correct location. To split the number into two 64-bit values, I just manually shifted and masked the right bits.

Summary

From a one-liner* in Python, to a very silly C solution, I think I was able to get a lot out of quite a simple task here. I’m a little surprised that a task like this came so quickly after one like Sieve, but sometimes the simple tasks can be fun to explore. As usual, solutions in Python, Scheme, and C are all published on Exercism, and thank you for reading!

* Just pretend SCORES doesn’t exist.