Experimenting with CSS "Point Clouds"

I made a thing! A self-portrait illustration that mimics a point cloud aesthetic. I liked the result, but to be honest, I enjoyed the process of making it even more, so buckle up and join me while I put together some of my notes.

The result is on my website, and I also have more examples here. It is made with box-shadows and CSS animations.

I have experimented with box-shadow animations before. Here is a cute purple Yoshi animated by mapping each pixel to a box shadow and changing them with CSS.

The resulting code of doing an animation with that technique is MASSIVE as each pixel is represented with a unique box-shadow of a 1px element; here is a fragment of the code:

[...]
box-shadow: #000 3px 54px, #000 3px 57px, #000 6px 54px, #6868d8 6px 57px,
  #000 6px 60px, #000 6px 63px, #000 6px 66px, #000 9px 51px, #000 9px 54px,
  #000 9px 57px, #f8f8f8 9px 60px, #f8f8f8 9px 63px, #f8f8f8 9px 66px,
  #000 9px 69px, #000 9px 72px, #000 9px 75px, #000 12px 51px,
  #f8f8f8 12px 54px, #f8f8f8 12px 57px,
[...]

But it has the advantage that it can be animated with @keyframes, and the browser will interpolate each shadow. It is an interesting experiment but not something I would recommend for more than a few pixels. More info.
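As a rough illustration of how such a box-shadow value can be generated, here is a minimal sketch. The helper name pixelsToBoxShadow, the { x, y, color } pixel shape, and the 3px scale (guessed from the offsets in the fragment above, which are all multiples of 3) are assumptions, not the code used for the Yoshi.

```javascript
// Hypothetical helper: turn a list of pixels into one box-shadow value.
// Each pixel is { x, y, color }; `scale` is the size of one sprite pixel in CSS px.
function pixelsToBoxShadow(pixels, scale = 3) {
  return pixels
    .map(({ x, y, color }) => `${color} ${x * scale}px ${y * scale}px`)
    .join(", ");
}

// Made-up sample frame with three pixels.
const frame = [
  { x: 1, y: 18, color: "#000" },
  { x: 1, y: 19, color: "#000" },
  { x: 2, y: 19, color: "#6868d8" },
];

console.log(`box-shadow: ${pixelsToBoxShadow(frame)};`);
// → box-shadow: #000 3px 54px, #000 3px 57px, #6868d8 6px 57px;
```

Each animation frame would get its own generated value inside a @keyframes step, which is why the CSS grows so quickly.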

Anyway, since then I have been thinking about making something more complex. Maybe I could make a point-cloud like thing? I could even simulate depth by moving points at different depths at different speeds. I decided to try it.

Getting and Parsing the data

First things first: I had to get a source 3D point cloud. I tried different apps, but my favorite is Capture for the iPhone. The app uses the front-facing camera and exports a .usdz file.

I don't know how to interpret USDZ files, but I remembered that OBJ files can be ASCII. After converting the .usdz file online, I got an .obj file that included vertex coordinates like this:

v 15.55977058410644528 -3.08242511749267578 -53.03444290161132812
v 15.44011497497558592 -3.14228510856628418 -53.07048416137695312
v 15.44011497497558592 -3.0152900218963623 -53.01507186889648437
v 14.469059944152832 -5.0644221305847168 -53.90859985351562496
v 14.35047054290771484 -5.12238502502441406 -53.95074462890624992

I can use this! I wrote a JavaScript function that reads the file, finds every line that starts with "v ", and puts the coordinates in an array. I had to do some cleanup, like translating everything so that the minimum value of each axis is 0. Then I tested the data by rendering points on a blank HTML page, and it worked! I got this image.
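The parsing and cleanup step could look roughly like this. It is a sketch under assumptions: the function name parseObjVertices is made up, and it only handles plain "v x y z" lines, ignoring everything else in the file.

```javascript
// Hypothetical parser: collect "v x y z" lines and shift each axis so its minimum is 0.
function parseObjVertices(objText) {
  const points = [];
  for (const line of objText.split("\n")) {
    const parts = line.trim().split(/\s+/);
    // "v" marks a vertex; "vn", "vt", "f" and other records are skipped.
    if (parts[0] !== "v" || parts.length < 4) continue;
    points.push(parts.slice(1, 4).map(Number));
  }
  if (points.length === 0) return points;

  // Find the minimum of each axis with a loop (safe for very large point lists).
  const mins = [0, 1, 2].map((axis) =>
    points.reduce((min, p) => Math.min(min, p[axis]), Infinity)
  );
  // Translate every point so the smallest value on each axis becomes 0.
  return points.map((p) => p.map((value, axis) => value - mins[axis]));
}

// Made-up sample input, not data from the real scan.
const sample = `v 15.5 -3.0 -53.0
vn 0 1 0
v 14.5 -5.0 -54.0`;

console.log(parseObjVertices(sample));
// → [[1, 2, 1], [0, 0, 0]]
```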

That looks like something already. Parsing the .obj every time is too slow, so the next step is to do all the calculations needed and store everything in a slimmer format.

Compressing

The .obj file was 19.9mb, so way too big for web usage. My first ideas were to resize the image, reduce the precision, and discard some points, but that was not enough. I wanted this on my website, and to make it worth it I set a goal of getting it under 45kb after gzipping, so I could replace my header picture without making my website heavier.

I explored different ideas. The first thing I noticed was that I do not need much depth resolution: I have no plans to implement free camera movement, so just a few depth levels should be enough for the parallax.

Here are some of the ideas that I tried:

X,Y,Z

About 8 bytes per point.

My first idea was to save the data as x,y,z coordinates and have z be a value from 0 to 9. Like this:

8,264,0 8,265,3 8,268,0 8,269,3 8,278,4 8,279,0

But it was still well over the size budget.

X,Y and use point order as depth

About 6 bytes per point.

Then I got another idea: Drop the depth value completely! Use only x,y and order the values from near to far. This changes the data as the distribution of depth is lost, but I tried anyway. The new format looked like this:

8,278 8,265 8,268 8,269 8,279 8,264

Which was much better, but still pretty big.
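A minimal sketch of that format, assuming points are stored as [x, y, z] and that a smaller z means "nearer" (both are assumptions for illustration, as is the function name):

```javascript
// Hypothetical encoder: drop z from the output and let the order carry the depth.
function encodeByOrder(points) {
  return [...points]                // copy so the caller's array is not reordered
    .sort((a, b) => a[2] - b[2])    // near to far (assumes smaller z is nearer)
    .map(([x, y]) => `${x},${y}`)
    .join(" ");
}

// Made-up sample points.
console.log(encodeByOrder([[10, 20, 5], [11, 20, 1], [12, 21, 3]]));
// → 11,20 12,21 10,20
```

The decoder only needs to read the pairs in order; the index of each pair stands in for its depth.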

X,Y but X and Y are restricted to 0-255 and thus 1 byte each

2 bytes per point.

Those numbers looked interesting... given the size that I was trying to use, maybe I could limit the image size to 256x256 to fit each coordinate in one byte, and since I know I'm only sending pairs of values, I could also drop the commas and spaces:

XYXYXY

Where X and Y each hold a value between 00000000 and 11111111 expressed as a single character, which means I only need to spend 2 bytes per point. Advantage: much more compressed. Disadvantages: now the image size is limited to 256x256 pixels, and the distribution of depth is still lost. All I have is the order from "this is the closest point" to "this is the farthest point". I played with this, but not every 3D model looked good, so I had to keep exploring.
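Here is a sketch of what the XYXYXY format could look like in code. The function names are made up, and it assumes the string is later written out with an encoding that keeps one byte per character (a plain UTF-8 write would expand any value above 127 to two bytes).

```javascript
// Hypothetical encoder: one character per coordinate, two characters per point.
function encodeXY(points) {
  let binaryStr = "";
  for (const [x, y] of points) {
    if (x < 0 || x > 255 || y < 0 || y > 255) {
      throw new RangeError(`point ${x},${y} does not fit in one byte per axis`);
    }
    binaryStr += String.fromCharCode(x, y);
  }
  return binaryStr;
}

// Hypothetical decoder: read the characters back two at a time.
function decodeXY(binaryStr) {
  const points = [];
  for (let i = 0; i + 1 < binaryStr.length; i += 2) {
    points.push([binaryStr.charCodeAt(i), binaryStr.charCodeAt(i + 1)]);
  }
  return points;
}

const encoded = encodeXY([[8, 255], [0, 17]]);
console.log(encoded.length);    // → 4
console.log(decodeXY(encoded)); // → [[8, 255], [0, 17]]
```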

Only keep depth, use position as coordinates

About half a byte per point (but I need to pay for empty ones).

So I thought: using the order of the points as depth made me lose the depth distribution and, at the same time, gave me too much depth resolution. Maybe I could invert the approach. So I did. Instead of saving the coordinates, I would keep only the depth value and use its position in the data to determine the X,Y coordinates.

With this approach, I would need to pay for pixels with no value, but I could make it 1 byte per pixel, so if at least 50% of the pixels are occupied I would still have a smaller file than with the previous format, which used 2 bytes per point. Nice. I also had to give up one depth value to signal empty pixels, so I had 255 levels left.

And one more thing: after some experimentation I found that 255 levels of depth were too many, so I reduced them to 15. That needs only 4 bits per point, so I could fit two points per byte: half a byte per point.

Also, with this method I don't have image size limits, and for the size that I wanted to use the files were about 65kb, so good enough.
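The per-point costs of the last two formats can be compared with a quick calculation before gzip. This is only a sketch: the function name, the 300x400 grid, and the 50,000 points are made-up numbers, not measurements from the actual scan.

```javascript
// Hypothetical size estimate (in bytes, before gzip) for the binary formats.
function formatSizes(width, height, pointCount) {
  return {
    xyPairs: pointCount * 2,                       // 2 bytes per occupied point
    depthBytes: width * height,                    // 1 byte per pixel, empty or not
    depthNibbles: Math.ceil((width * height) / 2), // 4 bits per pixel, two per byte
  };
}

console.log(formatSizes(300, 400, 50000));
// → { xyPairs: 100000, depthBytes: 120000, depthNibbles: 60000 }
```

With 4 bits per pixel, the grid format beats the XY pairs as soon as more than a quarter of the pixels are occupied, instead of half as with 1 byte per pixel.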

Encoding and Parsing

After parsing the original .obj file, I had a matrix of the depth of each point. Then I updated all the matrix values to go from 0 to 15. I'm ready to generate my binary file.
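One way to squeeze the depths into that range is a linear mapping. This is a sketch under assumptions: the post does not say how the values were rescaled, so the linear mapping, the choice of 0 as the "empty" marker, and the function name are all hypothetical.

```javascript
const EMPTY = 0;   // assumed marker for "no point at this pixel"
const LEVELS = 15; // depth levels 1..15

// Hypothetical quantizer: map a raw depth in [minDepth, maxDepth] to 1..15.
function quantizeDepth(depth, minDepth, maxDepth) {
  if (depth === null) return EMPTY;
  if (maxDepth === minDepth) return 1; // flat model: everything on one level
  const t = (depth - minDepth) / (maxDepth - minDepth); // 0..1
  return 1 + Math.min(LEVELS - 1, Math.floor(t * LEVELS));
}

console.log(quantizeDepth(null, 0, 10)); // → 0
console.log(quantizeDepth(0, 0, 10));    // → 1
console.log(quantizeDepth(5, 0, 10));    // → 8
console.log(quantizeDepth(10, 0, 10));   // → 15
```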

I encoded the binary string like this:

// here "Db" is the matrix holding the depth value (0 to 15) at each coordinate
const byteNumber = (Db[i][k] << 4) + Db[i][k + 1];
binaryStr += String.fromCharCode(byteNumber);

Db[i][k] and Db[i][k + 1] are two contiguous points. Since each value is at most 15, Db[i][k] << 4 shifts the binary representation 4 positions to the left, leaving enough room for Db[i][k + 1] so that both fit together in 8 bits. The maximum value, (15 << 4) + 15, is 255.

Here is an example of encoding 11 and 5 together:

11 is 1011 in binary
 5 is 0101 in binary

So to have them both in a single byte we want to get 10110101

First we shift the first number 4 positions
`11 << 4` is 176 (and 176 in binary is 10110000)

Now the smallest 4 bits are vacant and
we can add our second number to that value
(11 << 4) + 5 = 181

And 181 binary representation is the number we are looking for:
10110101

To decode the value, I use charCodeAt instead of fromCharCode, shift the byte 4 positions to the right to get the first point, and then subtract the first point (shifted back to the left) from the byte to get the second.

And that is it for the encoding. Success! The first 3D scan is now 19.8kb after gzipping, well below the goal of 45kb.
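Put together, the encoding and decoding steps described above could be sketched like this. The function names are made up, and the sketch assumes the decoder already knows the image width and that odd-width rows are padded with an "empty" 0.

```javascript
// Hypothetical encoder: pack two 4-bit depth values into each character.
function encodeDepthMatrix(Db) {
  let binaryStr = "";
  for (let i = 0; i < Db.length; i++) {
    for (let k = 0; k < Db[i].length; k += 2) {
      const first = Db[i][k];
      const second = Db[i][k + 1] ?? 0; // pad odd-width rows (assumption)
      binaryStr += String.fromCharCode((first << 4) + second);
    }
  }
  return binaryStr;
}

// Hypothetical decoder: `width` is the number of points per row.
function decodeDepthMatrix(binaryStr, width) {
  const bytesPerRow = Math.ceil(width / 2);
  const rows = [];
  for (let start = 0; start < binaryStr.length; start += bytesPerRow) {
    const row = [];
    for (let j = 0; j < bytesPerRow; j++) {
      const byteNumber = binaryStr.charCodeAt(start + j);
      const first = byteNumber >> 4;            // high 4 bits
      const second = byteNumber - (first << 4); // what is left is the low 4 bits
      row.push(first, second);
    }
    rows.push(row.slice(0, width)); // drop the padding value, if any
  }
  return rows;
}

const packed = encodeDepthMatrix([[11, 5, 0, 15], [1, 2, 3, 4]]);
console.log(packed.charCodeAt(0));         // → 181
console.log(decodeDepthMatrix(packed, 4)); // → [[11, 5, 0, 15], [1, 2, 3, 4]]
```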

Animating

My original plan of animating each point as a unique shadow of a 1px element immediately proved too slow. But! Putting the shadows in 15 different elements, one per depth level, and animating each layer was fast enough. Here is a view of the depth levels:

And with that, I was pretty much done.
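The one-element-per-depth-level idea can be sketched as a function that groups the points by depth and builds one box-shadow value per layer. The function name is made up, and it assumes a depth matrix where 0 means "empty" and 1 to 15 are the levels; the shadows carry no color, so they would use the element's current text color.

```javascript
// Hypothetical helper: one box-shadow string per depth level.
function layersToBoxShadows(Db, levels = 15) {
  const layers = Array.from({ length: levels }, () => []);
  Db.forEach((row, y) => {
    row.forEach((depth, x) => {
      // Skip empty pixels; depth 1 goes to layers[0], depth 2 to layers[1], ...
      if (depth > 0) layers[depth - 1].push(`${x}px ${y}px`);
    });
  });
  return layers.map((shadows) => shadows.join(", "));
}

// Made-up 2x2 matrix with two depth levels.
console.log(layersToBoxShadows([[0, 1], [2, 1]], 2));
// → ["1px 0px, 1px 1px", "0px 1px"]
```

In the browser, each string would be assigned to the boxShadow style of its own 1px element, and each element would then be animated as a whole, for example by moving nearer layers a bit more than farther ones to get the parallax.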

Once, while testing, I loaded the binary file from the wrong path and got something like this:

And it gave me a lot of MissingNo. vibes.

The parser I made is not smart enough to know when it got bad data, so it parses everything you throw at it.

Next Steps?

Some ideas that I want to implement to improve the rendering: