Driving a large WS2811 LED string with an ATtiny13 and nothing else

After I created code to drive a WS2811/WS2812 LED string with an 8 MHz AVR, some of the comments on that page prompted me to create a version of that code for AVRs that run at 9.6 MHz. In itself this was a lot easier, since these MCUs have more clock cycles available per bit than their 8 MHz counterparts. However, when talking about 9.6 MHz AVRs, we're really talking about the ATtiny13. This MCU is fast, small and dirt-cheap, but it is also limited in its resources: 64 bytes of RAM and 1024 bytes (= 512 instructions) of flash memory.

Our WS2811 driver, like all others that I've seen, is "memory mapped", which means that the driver requires 3 bytes in memory for every LED in the string, and every such triplet maps directly onto one of the LEDs. For a 64-byte MCU this means there is a theoretical limit of 21 LEDs in a string. In practice the limit is even lower, because a typical program will use some stack space. I found that a simple color cycle demo could drive at most 13 LEDs. At the same time, most of the demos that I created only light up a few LEDs of the string, so the actual information content is much less than the 3n bytes needed for n LEDs. Wouldn't it be nice if we could drive such a sparsely lit string with only the bytes we need to describe the LEDs that are actually lit, plus maybe some bytes describing where these LEDs are?

It turns out that not only is this possible, the resulting driver code is also considerably smaller than our original 9.6 MHz driver code! As a bonus, trying to fit an application in 500 instructions had a definite feel of historical re-enactment...

The source code is now part of the regular WS2811 driver on GitHub. Look for the file ws2811_controller_low_ram.cpp for demonstration code.

Driving 60 leds from a 64 byte RAM ATtiny13

Flares demo

Sparse data representation

The new driver reads an array of bytes and sends out a WS2811 serial signal, just like the old one did. However, where the previous driver just read a sequence of green, red and blue values from memory, the new one expects a sequence of bytes that describes blocks of consecutive LEDs, where each block consists of:

a byte containing the count of unlit LEDs preceding the block (the "Jump")
a byte containing the count of lit LEDs in the block (the "Count")
for each LED in the block a G, R and B value (three bytes per LED)

Data is terminated by either a zero Jump value or a zero Count value, except for the first Jump, which can be zero if the first LED in the string should be lit.

For example, suppose we want to create the following 20-LED string, where the leftmost LED is the first one:

5 unlit					2 lit		8 unlit								1 lit	4 unlit
black	black	black	black	black	blue	red	black	black	black	black	black	black	black	black	yellow	black	black	black	black

We can represent this with the following buffer contents:

Jump	Count	LED (GRB)			LED (GRB)			Jump	Count	LED (GRB)			Jump	terminate
5	2	0	0	255	0	255	0	8	1	255	255	0	4	0

The memory mapped representation would require 60 bytes, the sparse representation takes only 15 bytes.
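To make the format concrete, here is a minimal host-side sketch that expands a sparse buffer back into the flat GRB layout. The function name `expand_sparse` is made up for this illustration and is not part of the driver; the real driver never builds such an array, it sends the bits directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand a sparse buffer into a flat array of G, R, B bytes (3 per LED).
// Illustration only: expand_sparse is a hypothetical helper, not driver code.
std::vector<std::uint8_t> expand_sparse(const std::uint8_t *buffer, std::size_t size)
{
    std::vector<std::uint8_t> flat;
    std::size_t pos = 0;
    bool first = true;
    while (pos < size) {
        const std::uint8_t jump = buffer[pos++];
        // a zero Jump terminates the data, except at the very start
        if (jump == 0 && !first) break;
        first = false;
        // unlit LEDs are three zero bytes each
        flat.insert(flat.end(), static_cast<std::size_t>(3) * jump, std::uint8_t{0});
        if (pos >= size) break;
        const std::uint8_t count = buffer[pos++];
        if (count == 0) break; // a zero Count always terminates
        const std::size_t bytes = static_cast<std::size_t>(3) * count;
        assert(pos + bytes <= size); // block must fit inside the buffer
        flat.insert(flat.end(), buffer + pos, buffer + pos + bytes);
        pos += bytes;
    }
    return flat;
}
```

Feeding it the 15-byte example buffer above should give back the 60-byte memory mapped version of the same string.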

Driver code

First of all, the code is split into two parts: sending zeros and sending data. Secondly, with 12 clock cycles per bit there is enough time to create a one-loop-per-bit driver. Only the last two bits are unrolled into a two-bit loop. The image below shows the assembly code with the generated waveform to the left. Also on the left is the phase (00-0b for the first bits and 10-1b for the last bit) of each instruction.
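The 12 cycles per bit follow from dividing the CPU clock by the LED data rate. As a small compile-time check, assuming the usual 800 kHz data rate for these LEDs (that figure is my assumption here, it is not stated above, and the names are made up for the example):

```cpp
#include <cstdint>

// Assumed WS2811/WS2812 data rate of 800 kHz (1.25 us per bit).
constexpr std::uint32_t ws2811_bit_rate_hz = 800000;

// Number of CPU clock cycles available to emit a single bit.
constexpr std::uint32_t cycles_per_bit(std::uint32_t cpu_hz)
{
    return cpu_hz / ws2811_bit_rate_hz;
}

static_assert(cycles_per_bit(9600000) == 12, "ATtiny13 at 9.6 MHz: 12 cycles per bit");
static_assert(cycles_per_bit(8000000) == 10, "8 MHz AVR: 10 cycles per bit");
```

Two extra cycles per bit does not sound like much, but it is what makes room for the loop bookkeeping inside every bit.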

The code that emits zero-bits starts at z00 (z for zero, 00 for phase 00). The top half of the code is used to emit the first 23 bits of a 24-bit zero sequence. If we find that we're sending the last zero-bit, the code falls through to the second half where the number of data bytes is determined and where the code jumps to s00 where the data waveform is generated.

The code that starts at s00 follows the same structure: the first 7 bits are sent in a small loop, phases 00 to 0b. If we're sending the final bit, the code falls through to the second half (phases 10-1b) where we read the number of zeros to transmit and if that is non-zero we'll fall through to z00 again.
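For readers who don't want to follow the assembly, the control flow of the two phases can be modelled in plain C++. This is only an untimed sketch of the structure described above, not the driver itself: `model_driver` is a name invented for this example, it collects the bits in a vector instead of toggling a pin, and like the driver it trusts the buffer to be properly terminated.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Untimed model of the driver's two phases; returns the bits that
// would be sent. Hypothetical helper for illustration only.
std::vector<bool> model_driver(const std::uint8_t *p)
{
    assert(p != nullptr);
    std::vector<bool> bits;
    std::uint8_t jump = *p++; // the first Jump may legitimately be zero
    for (;;) {
        // "z00" phase: 24 zero bits for every unlit LED
        for (unsigned n = 24u * jump; n != 0; --n) bits.push_back(false);

        // end of the zero phase: find out how many data bytes follow
        const std::uint8_t count = *p++;
        if (count == 0) break; // a zero Count terminates

        // "s00" phase: send each data byte, most significant bit first
        for (unsigned bytes = 3u * count; bytes != 0; --bytes) {
            const std::uint8_t value = *p++;
            for (int bit = 7; bit >= 0; --bit)
                bits.push_back(((value >> bit) & 1) != 0);
        }

        // end of the data phase: read the next number of unlit LEDs
        jump = *p++;
        if (jump == 0) break; // a zero Jump terminates
    }
    return bits;
}
```

For the 20-LED example buffer this model produces 20 × 24 = 480 bits, of which only the bits belonging to the blue, red and yellow LEDs are ones.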

Application code

While the sparse representation makes it easy for the driver to send the complete waveform for an LED string in a timed fashion, things have become much harder for most applications. Previously, when an application needed to set an LED at a given position to a given color, that would look like this:

rgb leds[60]; // declare a string of 60 LEDs as a simple array of RGB-values.
// turn the 10th LED purple
leds[9] = rgb( 255, 0, 255);

The problem is that with the sparse representation we don't know where in the buffer to put the data for a pixel at a given position; this depends on the other pixels that are described in the buffer. So either we have to rewrite all applications/demos to create a sparse buffer in one go, or we create a function that can find the n-th pixel in the buffer and that performs all manipulations necessary to make sure that there is space for that pixel inside the buffer.

I chose the latter option, which comes at a cost (code space), but which allows the application code to be almost exactly the same as the memory mapped version. The extra code space is considerable, however: the implementation of the function that finds or creates the space for an LED takes more instructions than the driver itself. The new source code becomes:

sparse_leds<38, 60> leds; // declare a sparse buffer of 38 bytes for a string of 60 LEDs
// turn the 10th LED purple
get( leds, 9) = rgb( 255, 0, 255);
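To give an idea of what happens behind such a call, here is a sketch of only the "find" half of the problem: walking the buffer to locate the bytes of a given LED. This is not the library's get() implementation. The name `find_led` is invented for the example, and the sketch simply returns a null pointer for an unlit LED, where the real function has to shuffle the buffer contents around to make room.

```cpp
#include <cassert>
#include <cstdint>

// Return a pointer to the G, R, B bytes of LED `index`, or nullptr when
// that LED is not described in the buffer (which means it is unlit).
// Hypothetical read-only helper, for illustration only.
const std::uint8_t *find_led(const std::uint8_t *p, std::uint8_t index)
{
    assert(p != nullptr);
    std::uint8_t jump = *p++; // the first Jump may be zero
    for (;;) {
        if (index < jump) return nullptr; // inside a run of unlit LEDs
        index -= jump;
        const std::uint8_t count = *p++;
        if (count == 0) return nullptr;   // terminator, no more blocks
        if (index < count) return p + 3 * index; // inside this lit block
        index -= count;
        p += 3 * count;                   // skip this block's color bytes
        jump = *p++;
        if (jump == 0) return nullptr;    // terminator
    }
}
```

Even this read-only version has to walk every block that precedes the requested LED, so the cost of a lookup grows with the number of blocks in the buffer.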

I briefly tested whether I could turn this into an operator[] to make the syntax exactly the same as the array variant. Although the operator works and does not incur any overhead, I decided against it, because using a free function reminds the programmer (me) that a non-trivial calculation takes place. The compiler doesn't seem to memoize the call to the operator or the get()-function either, so it is wise to call this function only when needed and to hold on to the result when it is used more than once. The library also contains an overload of the get()-function for arrays of rgb-values. This makes it easy to create programs that work for both types of LED string buffers:

rgb leds[60]; // declare a string of 60 LEDs as a simple array of RGB-values.
// turn the 10th LED purple
get( leds, 9) = rgb( 255, 0, 255);
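As a rough sketch of how this can work: for a plain array, get() is nothing more than indexing, and a demo written as a template can then accept either buffer type. The `rgb` struct below is a simplified stand-in and `fade_in` is a made-up demo function; neither is copied from the library, so treat the details as assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for the library's rgb type (an assumption).
struct rgb {
    rgb(std::uint8_t r_ = 0, std::uint8_t g_ = 0, std::uint8_t b_ = 0)
        : g(g_), r(r_), b(b_) {}
    std::uint8_t g, r, b; // stored in G, R, B order, as sent to the LEDs
};

// Array overload: for a memory mapped buffer, get() is plain indexing.
template <std::size_t size>
rgb &get(rgb (&leds)[size], std::uint8_t position)
{
    assert(position < size);
    return leds[position];
}

// Works with any buffer type for which a get() overload exists.
template <typename buffer_type>
void fade_in(buffer_type &leds, std::uint8_t position)
{
    rgb &pixel = get(leds, position); // look the pixel up once...
    pixel.r += 16;                    // ...then reuse the reference
    pixel.b += 16;
}
```

Note that fade_in calls get() only once and keeps the reference, which matters for the sparse buffer where every call means another walk through the buffer.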

Downloading the demo code

The sparse and 9.6 MHz code is now integrated in the regular GitHub repository. The ATtiny13 demonstration code starts in ws2811_controller_low_ram.cpp. Examples of demo code that runs on both sparse buffers and regular memory-mapped buffers are in flares.hpp and in chasers.hpp.
