Mechanical March is upon us, so the next few weeks should be full of compiled featured languages, with at least one that compiles to native machine code per week. This week’s exercise Circular Buffer is a data structure based one, where the goal is to build a general purpose data structure with a set of functions to manipulate it. In this case, as the name suggests, we’re building a circular buffer.
Circular Buffers
Circular buffers store a sequence of items. Items can be inserted at the end of a circular buffer, and taken from the start. So far, this is just the definition of a queue, and circular buffers are in fact one way to implement a queue.
What makes circular buffers unique is the way that they use a fixed buffer in memory, without copying data around as items are added and removed. The part of a circular buffer that is used can start anywhere within the underlying memory, and when you try to read past the end, it should wrap back around to the start. For instance, the following buffer contains the items 1, 2, 3, and 4, even though they are split up differently in memory:
Many community solutions I have seen for the Circular Buffer exercise do not actually use a circular buffer, because the tests will pass just fine using any other implementation of a queue, and they cannot detect when a more straightforward approach is used in a high level language.
Languages
C
First up is C, a language which has been around for over 50 years, and is essentially the de facto standard compiled language. Programming in C requires a lot of thought about managing memory usage, and this is very apparent in the implementation of a circular buffer.
C requires header files, typically with the extension .h
, in order to know what functions are defined elsewhere. A header file contains just function prototypes without implementations, so that another .c
file can include the header with #include
and reference the functions and use the types declared within.
Exercism usually doesn’t provide full header files in the C track, instead leaving students to determine the necessary interface from the test file. This is the header file I created for Circular Buffer:
#ifndef CIRCULAR_BUFFER_H
#define CIRCULAR_BUFFER_H
#include <stddef.h>
typedef int buffer_value_t;
typedef struct circular_buffer circular_buffer_t;
circular_buffer_t *new_circular_buffer(size_t capacity);
void delete_buffer(circular_buffer_t *buffer);
void clear_buffer(circular_buffer_t *buffer);
int write(circular_buffer_t *buffer, buffer_value_t value);
int overwrite(circular_buffer_t *buffer, buffer_value_t value);
int read(circular_buffer_t *buffer, buffer_value_t *ptr);
#endif
Because handling generic data types in C is more challenging, the creators of the C track decided to limit this exercise to just storing int
s, so I create a type alias buffer_value_t
for int
, along with declaring the circular_buffer_t
type – the contents of that struct are defined in the C file.
The first three functions are used to create and destroy a buffer, along with resetting its state to clear it. These functions also need to explicitly allocate the memory for the circular buffer to use, both for its internal state and the full data buffer. The other three functions are the three main operations that need to be provided on the buffer. write
adds an item to the end of the buffer, read
takes one from the start, and overwrite
functions like write
, except if the buffer is full it will overwrite the earliest item.
In the .c
implementation file itself, the first thing I did was define the circular_buffer
struct fully:
struct circular_buffer {
size_t capacity;
size_t write;
size_t read;
bool full;
buffer_value_t data[];
};
In this implementation, I chose to use two indices marking the current read position and the current write position. This unfortunately causes issues with empty and full buffers. In both cases, the read and write positions will be the same, so I used an extra field bool full
to resolve the ambiguity. The data is stored in the struct using a “flexible array member”, which is a C99 feature. This allows me to allocate the header containing the internal state and the buffer_value_t
buffer at the same time.
I implemented the first three functions as follows:
circular_buffer_t *new_circular_buffer(size_t capacity) {
circular_buffer_t *buffer = malloc(sizeof(circular_buffer_t) + sizeof(buffer_value_t) * capacity);
buffer->capacity = capacity;
clear_buffer(buffer);
return buffer;
}
void delete_buffer(circular_buffer_t *buffer) {
free(buffer);
}
void clear_buffer(circular_buffer_t *buffer) {
buffer->write = 0;
buffer->read = 0;
buffer->full = false;
}
new_circular_buffer
calls malloc
(pronounced “mal-ock”) to allocate the memory for the structure, including its data as mentioned above. It then stores the capacity for later, uses clear_buffer
to reset the rest of the state, and returns the pointer that malloc
created.
delete_buffer
is a simple call to free
to deallocate the memory, and clear_buffer
just sets the rest of the buffer state to indicate it is empty – the data is technically still in data
, but won’t be accessible through the buffer functions.
Because write
and overwrite
are very similar, I chose to create a single shared implementation that is used by each, with a flag to indicate if overwriting should be allowed:
static int write_impl(circular_buffer_t *buffer, buffer_value_t value, bool over) {
if (buffer->full && !over) {
errno = ENOBUFS;
return EXIT_FAILURE;
}
buffer->data[buffer->write++] = value;
if (buffer->full) {
buffer->read++;
}
buffer->read %= buffer->capacity;
buffer->write %= buffer->capacity;
buffer->full = buffer->write == buffer->read;
return EXIT_SUCCESS;
}
- If
over
is not set, check whether the buffer is full (full
flag).
If it is, report the error by settingerrno
and returning a failure status. - Write the value into the current cell according to the
write
index, and increment that index. - If the buffer was already full, the
read
index should also be incremented so it still points to the oldest item. - Wrap both
write
andread
indices to fit in thecapacity
. - Update the
full
flag. If the read and write pointer are the same after inserting an item, the buffer must now be full.
write
and overwrite
themselves are of course just simple wrappers around this:
int write(circular_buffer_t *buffer, buffer_value_t value) {
return write_impl(buffer, value, false);
}
int overwrite(circular_buffer_t *buffer, buffer_value_t value) {
return write_impl(buffer, value, true);
}
Finally, read
is similar to write
, but without needing to worry about “over-reading”, because that’s just plain silly.
int read(circular_buffer_t *buffer, buffer_value_t *ptr) {
if (buffer->write == buffer->read && !buffer->full) {
errno = ENODATA;
return EXIT_FAILURE;
}
buffer->full = false;
*ptr = buffer->data[buffer->read++];
buffer->read %= buffer->capacity;
return EXIT_SUCCESS;
}
There’s a similar empty buffer check, we always clear full
(how can a buffer be full if we just removed from it?), and the data gets stored to a caller-provided pointer.
You can find my full solution here. Wow, this article is already longer than last week’s, and I have three more languages to go. Why three? Well,
Gleam (Bonus!)
it just so happens that I already solved Circular Buffer in C, less than a month before it would have counted for #48in24. So to get that sweet golden medal, I had to throw in a bonus language (for the second time)! My language of choice was Gleam. I covered Gleam a mere two weeks ago, but I liked it a lot and Gleam just released version 1.0, so I felt like bringing it back.
But now comes the sad part. I had to cheat. Gleam doesn’t have arrays at all, they wouldn’t work well with its immutable data model, since the entire array would need to be copied. So Circular Buffer is “impossible” in Gleam, and we have to settle for some other queue implementation.
Now, List Ops in Gleam was all about not using the standard library, so this time I’ll focus on using the standard library as much as possible. First, we need a custom type to represent our not-really-a-circular-buffer, and a constructor to go with it:
pub opaque type CircularBuffer(t) {
CircularBuffer(capacity: Int, queue: Queue(t))
}
pub fn new(capacity: Int) -> CircularBuffer(t) {
CircularBuffer(capacity, queue.new())
}
Making the type opaque
ensures that code outside of the module won’t be able to peek inside and manipulate the data in odd ways. The type contains two fields – capacity
, which stores the maximum amount of items in the queue, and queue
, a Queue
from the queue
module, which is created in the constructor with queue.new()
.
Next, I implemented read
like this:
pub fn read(buffer: CircularBuffer(t)) -> Result(#(t, CircularBuffer(t)), Nil) {
let CircularBuffer(capacity, queue) = buffer
use #(item, queue) <- result.map(queue.pop_front(queue))
#(item, CircularBuffer(capacity, queue))
}
There’s a bit going on here, so I’ll break things down bit by bit.
CircularBuffer(t)
: This was used in week 6 as well, but the lowercaset
here, or any other lowercase identifier used in a type name, is used for generics. Every time the same identifier shows up in a particular context, it refers to the same type, so the generic buffer type affects the return value of this function.Result
: This type has two variants –Ok
, which holds a value of the first generic type and indicates a successful result, andError
, which holds a value of the second generic type and indicates that something went wrong. In this case, the error type isNil
, which means we store no information about what went wrong.#(t, CircularBuffer(t))
: I’ve already mentioned the generic part of this, but this is a tuple, allowingread
to return multiple values. Specifically, whenread
returns anOk
variant, it will return the item read from the queue, as well as the updated queue (Gleam doesn’t have mutable types, so we need to return this).let CircularBuffer
: Pattern matching works on custom types too.use
: Nope, nope, nope, nope, nope. I can’t do this in this list,use
gets it’s own section.
Gleam’s magic keyword
(this section is based on Gleam’s use
announcement)
use
is a pretty clever keyword, something I just haven’t seen in other languages (Haskell monads are a bit similar?). In Gleam, it’s common to write functions that take a callback as their last argument, and the standard library has many of them. gleam/result
contains 7 such functions, notably including result.try
and result.map
, which allow you to run code only if a previous Result
was Ok
. Some similar tools exist in gleam/option
, there are of course common functional programming features that work similarly in gleam/list
, and even gleam/bool
has a helpful function in this form (which I’ll use soon).
Unfortunately, when you start chaining these callback-based functions, you quickly reach what JavaScript developers refer to as “callback hell” – a horrible triangular (from the indentation) mess of nested function in nested function. Initially, Gleam came up with a way to make working with results more ergonomic, as this was a common place that the problem might arise. It was the try
statement, a simple way to unwrap an error if everything is Ok
, or return its Error
value otherwise:
pub fn login(credentials) {
try user = authenticate(credentials)
try profile = fetch_profile(user)
render_welcome(user, profile)
}
Which is fine for most cases, but what if you do want to start nesting other callback functions? In Gleam 0.25, a new solution was created: use
. Rather than handling Result
s specially, use
is general purpose syntactic sugar for callbacks. Code that looks like this:
pub fn main() {
use file <- with_file("pokemon.txt")
write(file, "Oddish\n")
write(file, "Farfetch'd\n")
}
is rewritten to this code:
pub fn main() {
with_file("pokemon.txt", fn(file) {
write(file, "Oddish\n")
write(file, "Farfetch'd\n")
})
}
Everything after the use
expression becomes the body of a callback that gets passed to another function. This leads to highly flexible custom control flow, with just a single syntax addition.
Back to my implementation of read
:
pub fn read(buffer: CircularBuffer(t)) -> Result(#(t, CircularBuffer(t)), Nil) {
let CircularBuffer(capacity, queue) = buffer
use #(item, queue) <- result.map(queue.pop_front(queue))
#(item, CircularBuffer(capacity, queue))
}
First, I use pattern matching to get the current capacity and queue. Then, I call queue.pop_front
to take the next item from the queue. If there are no items this returns Error(Nil)
, which should be the final result. Calling result.map
(with the use
syntactic sugar) lets me return Error
values as-is, while updating an Ok
value. In this case, I just wrap up the resulting queue in a new CircularBuffer
.
pub fn write(
buffer: CircularBuffer(t),
item: t,
) -> Result(CircularBuffer(t), Nil) {
let CircularBuffer(capacity, queue) = buffer
use <- bool.guard(queue.length(queue) == capacity, Error(Nil))
Ok(CircularBuffer(capacity, queue.push_back(queue, item)))
}
write
shows off another use of use
– bool.guard
. As we’ve already seen, Gleam doesn’t have typical control flow, so usual tricks like returning early if a condition isn’t met won’t work. However, since use
is so flexible, bool.guard
implements exactly that. If the condition is true, bool.guard
evaluates to the given value, otherwise the (implicit) callback is run. In this case, it means that we can return an error if the queue is at capacity, then continue with the function knowing that there will be space.
pub fn overwrite(buffer: CircularBuffer(t), item: t) -> CircularBuffer(t) {
use <- result.lazy_unwrap(write(buffer, item))
let CircularBuffer(capacity, queue) = buffer
let assert Ok(#(_, queue)) = queue.pop_front(queue)
CircularBuffer(capacity, queue.push_back(queue, item))
}
Yet another use
here, this time with result.lazy_unwrap
. lazy_unwrap
is similar to map, but instead of running the callback for Ok
values, it runs it if the input value was an Error
. In this case, I first try to write
the item normally, and if that fails, the rest of the function will run. In that case, I remove one item from the front of the queue, and add the new one at the back.
Finally, clear
is nice and simple to finish off with:
pub fn clear(buffer: CircularBuffer(t)) -> CircularBuffer(t) {
CircularBuffer(..buffer, queue: queue.new())
}
The only new thing here is the ..buffer
record update syntax, which allows you to replace just some fields of a record type.
My complete Gleam solution can be found here. That was a pretty big one to cover, especially with my tangent about use
, but I’ve really enjoyed playing around and learning more about Gleam through Exercism.
Groovy
Groovy is a dynamic JVM-based language created by Apache. It is often used as a scripting language, sometimes embedded in other tools such as Gradle or Jenkins.
The implementation I went with in Groovy is pretty similar to my C one, but using object-oriented features without the manual memory management. The Circular Buffer task starts off with two predefined exceptions, as this is how error handling typically works in JVM-based languages.
class EmptyBufferException extends Exception {}
class FullBufferException extends Exception {}
They are pretty self explanatory, and just get thrown from within the various CircularBuffer
methods. The class definition contains the exact same fields as the C version, and again only supports int
s as data:
class CircularBuffer {
int capacity
int writeIndex
int readIndex
boolean full
int[] buffer
}
The constructor and clear()
function should also look very familiar:
CircularBuffer(int capacity) {
this.capacity = capacity
buffer = new int[capacity]
clear()
}
void clear() {
writeIndex = 0
readIndex = 0
full = false
}
read
is pretty close to the C version. The differences are that it throws an exception instead of using errno
and uses a temporary variable rather than an output pointer:
int read() {
if (readIndex == writeIndex && !full) throw new EmptyBufferException()
int value = buffer[readIndex]
readIndex = (readIndex + 1) % capacity
full = false
value
}
For write
and overwrite
, I wrote separate implementations, rather than using the same write_impl
idea:
void write(int item) {
if (full) throw new FullBufferException()
buffer[writeIndex] = item
writeIndex = (writeIndex + 1) % capacity
full = writeIndex == readIndex
}
void overwrite(int item) {
buffer[writeIndex] = item
if (readIndex == writeIndex) readIndex = (readIndex + 1) % capacity
writeIndex = (writeIndex + 1) % capacity
full = writeIndex == readIndex
}
Again though, the logic mostly follows the C version. Check if the buffer is full, write an item into it, move the write index along, and the read index too if necessary, and update full
.
The full code is here, and we have just one more language to go.
Scala
Scala is another JVM-based language, although it is now able to target JavaScript and native executables, which makes it a bit more flexible. There are two major versions of Scala currently in use, Scala 2 and Scala 3. Scala 3 has many syntactic changes, though it is largely backwards compatible and provides migration tools. (Why does this sound so similar to Python’s story?) Currently, Exercism uses Scala 2 in the Scala track, so I’ll be using the more traditional curly-brace syntax supported in that version.
Because Scala isn’t all that different from Groovy, I’ll make things a bit more interesting by trying a different implementation of a circular buffer. Previously, I’ve used two array indices to keep track of the used part of the buffer, but this time I’ll use just one, plus a variable to track the size of the buffer – how many items are currently in it, not the overall capacity.
This also allows me to get rid of the full
variable, as there’s no longer any ambiguity between full and empty:
class CircularBuffer(val capacity: Int) {
var pos = 0
var size = 0
var buf = new Array[Int](capacity)
}
Scala doesn’t require an explicit constructor here, instead we can just define the instance variables and their default values, which is a nice convenience. capacity
is also automatically made into an instance variable because it was defined with val
in the constructor arguments.
def clear() = {
size = 0
}
I think this might be the single simplest snippet of code I’ve written so far for any of these #48in24 posts, and to be honest, I don’t think I can beat it! I don’t need to worry about pos
here, because we can start writing anywhere in the buffer – that’s the whole point of circular buffers.
As for the rest of the functions, they still follow the previous model pretty closely:
def write(value: Int) = {
if (size == capacity) throw new FullBufferException()
val index = (pos + size) % capacity
buf(index) = value
size += 1
}
def read(): Int = {
if (size == 0) throw new EmptyBufferException()
val item = buf(pos)
pos = (pos + 1) % capacity
size -= 1
item
}
def overwrite(value: Int) = {
val index = (pos + size) % capacity
buf(index) = value
if (size == capacity) pos = (pos + 1) % capacity
else size += 1
}
Here are the differences I’d like to highlight:
- I can’t get the write position directly any more, and I need to calculate it from
pos
andsize
, taking into account the wrap-around behaviour. read
has to update bothsize
andpos
, for a similar reason - there’s no way to keep the write position fixed now, as it depends on both.- There’s no need to manage
full
, since I just checksize == 0
andsize == capacity
.
Overall, while it doesn’t reduce the size of the code by much, I think I prefer this position + size approach to the two positions one. It removes an annoying edge case, and generally feels a little easier to understand. The full solution code is here if you want to check it out.
Final Thoughts
I’m happy that Mechanical March has started, and I get to use languages with static typing, more manual control over behaviour, and higher performance, as those are my preferred languages! C is basic but reliable and a nice language to start off the month with. Groovy and Scala both seem like nicer JVM languages than Java itself, and while I don’t usually rely heavily on OOP, it’s a nice fit for custom data structures. It’s a bit of a shame that none of the featured languages this week used generics in their Circular Buffer task implementation, as that makes such a structure infinitely more useful.
I’m glad I got to revisit Gleam so soon, now that I have built up my skills a little more through Exercism. It’s definitely a language I’ll have to keep an eye on, because it’s already introduced some excellent new features like use
, and I’m excited to see where the developers take it now that it’s reached 1.0!
As always, feel free to send feedback to feedback@ignore.soxfox.me
, and I hope you were able to get something out of this (rather long) journey through circular buffers!