Mysterious Sparkling Bug

I’ve been noticing for a while an occasional bit of “pixel sparkling” on some of the walls.


Those stray yellow pixels, which are the color of the next room over. My theory is that it’s small gaps in my triangle mesh, maybe due to arithmetic round-off error. Ok! Let’s tweak the shader to show those triangles.


Yup, that’d be it. When building the mesh, the geometry is initially only decomposed into rectangles. I notice: There are never sparkles on the rectangle halves. Great! I’ll just expand each rectangle slightly.

Rooms in Metareal are generally about 10x10x10. (I think of it as meters… but it’s just “units”.) When I expand each rectangle by 0.01 (a centimeter? sort-of?) it fixes all the sparkles, sure enough, but makes a mess of the corners.


I wonder: How “big” are these sparkles? As an experiment, I shrink all the rectangles by 0.0001 (a tenth of a millimeter?) and sure enough… well… take a look.


And expanding each rectangle by 0.0001 seems just right. No visible corner-mangling or overlap, and, as near as I can see, all sparkles gone.


And some of the code that drives this…

// Build the four corners of the rectangle
    MeVec3 p00 = wVector + sVector * sLo + tVector * tLo;
    MeVec3 p10 = wVector + sVector * sHi + tVector * tLo;
    MeVec3 p01 = wVector + sVector * sLo + tVector * tHi;
    MeVec3 p11 = wVector + sVector * sHi + tVector * tHi;

// we like to expand the rectangle a tiny bit, we need normalized s&t
    float expandAmount = 0.0001;
    MeVec3 s1Vector = sVector.normalize() * expandAmount;
    MeVec3 t1Vector = tVector.normalize() * expandAmount;

// apply the expansion        
    p00 += -s1Vector -t1Vector;
    p01 += -s1Vector +t1Vector;
    p10 += +s1Vector -t1Vector;
    p11 += +s1Vector +t1Vector;

Always nice to resolve an ongoing irritation.

8-bit Collision

In process of completely revising the collision engine.


It’s a simplified physics model, much like sliding-block physics of, say, Pengo from the 80’s. All colliders are axis-aligned boxes. Pushes can be transitive. And each movement has a force associated with it, which determines how much it can push, or if a wall will stop it.

The first version was prone to silly troubles like allowing penetration of side walls in some tight corridors. A future blog post shall give excruciating detail!

Inspiring Bug

Was algorithmically generating the basic framework of 512 equal-sized cube rooms. But this bug…



Looks pretty cool. Perhaps one or more of the eight lobes will be a more free-form volume of panels and dividers. Potentially much more disorienting!

Editor & Debugging

Just a quick reveal of the Editor, and some ways it helps development.

This whole project is ad hoc, bespoke, and DIY. Optimizing time, features, and function is mandatory. To this end, the Game and the Editor are one and the same. The Editor features will be disabled for a release build, but that’s just a detail.

The Editor features are for editing, certainly, but also for debugging and visibility. Below is an animation showing

  • parameter editing, including a bitmap editor,
  • object type changing,
  • and memory checking, confirming that edit operations don’t leak.

(Click it to go big.)

Beginnings of Culling

Both my laptop and my desktop seemed relatively happy with 1,000,000 triangles per frame, in several materials. And the entire world will probably come in around 750,000 triangles. So that’s great!


Not everyone has hardware from 2014 and 2015. And, anyway, for certain optical effects I need to do between 3 and 5 renders per frame. So, can’t just render the whole world every frame.

So I’ve started adding culling to the engine. Here’s a quick test where couple of hundred parts (individual object meshes) are hidden and shown per frame. Managing those lists can be expensive, but it can be easily trickled out across multiple frames as needed.

Sadly, QuickTime Player screen capture is somewhat low-quality. Seeking better screen capture solution…

Moments in Coding

You know what is scary? I will tell you a thing that is scary.

When you perform a somewhat spanning refactor —

Well let me back up a moment. I have a general approach to coding, and other things, which goes like this:

  • Charge ahead, make some progress without worrying too much about long-term consequences.
  • Step back and look at the results.
  • Now I am an expert. An expert on one very very tiny field of knowledge, but one that is immediately applicable to my needs…
  • Double back to step one, and revisit it. It may feel like “not making progress”… but with experience you know that certain things really do lead to long-term benefits. Eat your vegetables and whole grains, kid.

The scope of these revisitations varies. Sometimes I’ll bang out a solution, say, Hooray, and revert my files to do it again in the same hour. Other times I’ll go back a month later and refactor while keeping tests passing.

Where were we. Yes, earlier this week I was revisiting the base class for all Metareal objects (some would call them “game objects”). After spinning up the editor, and implementing four or five of them, I had a pretty good idea of what the life cycle should be, and a better idea of the distribution of responsibilities between World, Level, Editor, and Object.

When I started banging out the code, I didn’t really even have those names. The Level object was handling some world duties, and some Editor duties. And the Editor was handling SDL events directly, and so on. What they now reboiled down to is:

  • Editor. One of several possible top level apps. Instantiates a Level and a World. Gets first crack at UI events, passes them to the World sometimes.
  • World. Owns the list of objects. Owns the Level (after Editor loads it and gives it to the World). Responds to ticks, and owns the renderer.
  • Level. Turns files into object lists, and vice versa.
  • Object. Receives in-world events like player-touches and ticks and messages. May have renderable portions, or volume or collision presence. Has some “Editor-only” affordances like debug-info messages, getTriangleCount().

Nothing too radical, but took a little while to settle into this orderly form. So this refactor I was doing, I had a pretty clear idea, and some notes, and I dug into it. Lots of temporary ifdef-ing, and new unit tests along the way. Which brings us to the posting topic.

The thing that is Scary

Sometimes a thing is scary. It makes me question the plausibility of this whole endeavor. A person can’t write a big computer program… it can’t be done!


After laying out all the parts on the workbench, and 15 hours of coding, iterating, testing, everything was working again, looking good. Except the CPU and frame rate were all wrong. Previously, I could run a million triangles and fifty thousand collision zones at 12% and 60 FPS. But now it was slogging. The CPU was erratic. 15%. 45%. Debugger break, go, and back to 20%. And the frame rate would zoom to 60 (Hooray!) and then stumble to 38 (Wafna! Wafna!).

I ran the old version on a laptop side by side. The performance counters all matched reasonably: the same amount of “work” was being done, as expected. The memory footprint was comparable. Still no memory leaks.

But I had rejiggered quite a bit. What had changed? I kept profiling and iterating. Nothing. How could this be??


Finally I rolled back to the previous version on the same computer… and saw that the old version also was behaving all wrong. Then I ran some other games, and they were terrible.

A reboot fixed everything. The refactor had worked just fine.

I’ll close with a small observation, to keep handy in special moment of confusion. (For example, the day before shipping when half the subsystems suddenly fail, and the test server breaks, and and and…)

The inexplicable usually involves a coincidence.

Testing & Debugging Miscellanea

Here’s a quick peek into some of the test and debug strategae I’m using…

Base Object

One of the great hazards of C++ is object leaks. To mitigate this, I have a base object that everyone descends from, called MeObjectBase. My descendant object constructors all look like

MeRenderWorld::MeRenderWorld(int width, int height) : MeObjectBase("MeRenderWorld")
{ ... }

Then we can print a tally of current objects by type name. But all that text-dictionary lookup can be expensive! So there’s a global boolean to enable or disable count-by-name. Even when disabled, the total number of objects is tracked.


Like object leaks, memory leaks are alway looming. The main code never calls malloc or free directly. Instead, some wrappers are used which keep a tally of how much memory has been allocated and freed, and how many pointers have been allocated and freed. A few bytes at the beginning of the block hold the size. These static methods on MeObjectBase.

class MeObjectBase
    static void *malloc(size_t bytes);
    static void reallocAt(void **ptr, size_t bytes);
    static void freeAt(void **ptr);

Unit Testing

I’ve been coding since I was 12. That’s forty years now, but who’s counting. The only significant thing I’ve learned in the last twenty or so is the joy and beauty of organized unit tests. I used to do unit tests, without realizing it: I’d write some code at the top of main() to call the new function, print out the result, and quit. Then I’d delete the test and continue development.

I’ve also often written little test apps alongside my “real app” to exercise libraries.

Anyway, yeah, unit testing. For Metareal, I’m just running a small command line app that tallies up trues and falses and prints the result at the end.

Why not use an existing C++ unit testing framework? No great reasons, but among them: Tends to run slower as files are scanned or preprocessed for tests, adds code I don’t know that well. On the other hand, testing frameworks typically let you run single tests if needed. And I have to explicitly add every test function to main(). Oh well, is a tradeoff.

Here’s what some tests look like:

// A typical "main method" for a test file
void allMatrixTests()

// A typical assertion
    float m03 = m4(0,3);
    ASSERT_EQUALS_FLOAT("col/row", 3, m03);

// Implementation of one of the assertions
#define ASSERT_EQUALS_INT(_message,_want,_got) assertEqualsInt(1,__LINE__,_message,#_want,#_got,(long long)_want,(long long)_got)
void assertEqualsInt(int sayIt,int line,cc message,cc wantS,cc gotS,long long want,long long got)
    if(got == want)
        assertLogFail("%6d. FAIL %4d %s %s(%lld) != %s(%d)\n",g.fails,line,message,wantS,want,gotS,got);

The last assertion I make is that all the memory and objects have been freed. Here’s a happy output.

    currentCount:   0
       copyCount:   0
    unfreedCount:   0
    unfreedBytes:   0
         BitmapThing:   0    2    3
                Ham1:   0    1    1
             IfThing:   0    2    6
            IxMover2:   0    2    8
   MeCollisionVolume:   0   10  134
              MeGaud:   0    3   34
          MeGeometry:   0 1001 1541
               MeHam:   0    1    1
          MeITexture:   0    4   81
               MeLru:   0    2    3
              MePart:   0 1000 1190
       MeRenderParts:   0    3   59
       MeRenderWorld:   0    1   19
      MeTextureAtlas:   0    2   23
     MeTextureFloat4:   0    3   22
         MeThingKind:   0    1    8
      MeThingManager:   0    1    4
            MeVolume:   0  128  493
       MeVolumeWorld:   0    1   25
     MockFrameBuffer:   0    2   10
        MockMaterial:   0    3   67
        MockRenderer:   0    3   28
                   a:   0    3    3
                   b:   0    1    1
             unknown:   0    3   37
     0.  ok       undisposed objects 0(0) == MeObjectBase::currentCount(0)
     0.  ok       undisposed mallocs 0(0) == MeObjectBase::mallocCount - MeObjectBase::freeCount(0)
test results: 7551 pass / 7551 assertions (0.686 seconds)
Program ended with exit code: 0

Thing to notice: the total runtime for these tests is about a second. It covers the math, part and triangle-list management, many, many collision and volume intersection cases. It does not cover actual, live OpenGL code.

Debug Logging

Little to say here, I have a couple of log methods, which take arguments like printf. Listeners can be added.

Debug Global Booleans

Aaah, yes, debugging realtime code can be tricky. To help, when running the realtime app I map control-0 through control-9 directly to ten globally accessible booleans. Sometimes I’ll add some code to a deep, inner function which does something special based on those booleans. Then I can trigger it at will while running. Works nicely with breakpoints.

Well, Ok

That’s just some of the goodies in play.

Performance Check

I’ve been watching the CPU continuously during development. The Metareal World has many moving parts, and I want them all to keep moving, all the time.

Here’s 15000 cubes all orbiting a black hole. They do not affect each other.


That took 80% CPU on a modern Mac Pro desktop. Most of that time is uploading the 15000 new position in a texture for use by the vertex shader.

Here’s some other measurements so far:

  • Animate 15000 cubes: 80% CPU
  • Draw 1000000 triangles (500000 triangles in the scene, from two angles): 60 FPS
  • Draw more than 1000000 triangles: FPS drops fast!
  • Draw rates on Mac Pro Retina Notebook 2013: about the same
  • Draw rates on an iMac 2010: about 30 FPS… choppy… sad.
  • Collision resolution among 47000 AABBs, 900 of which are moving: 10% CPU
  • Collision resolution same, on Mac Pro Retina Notebook 2013: 20% CPU

Looks like my work is cut out: need to draw as few triangles as possible!

Fortunately, I have a robust system for managing volume intersections. It is the foundation for the collision detection and resolution. It will also be used to detect various kinds of triggers, such as passing through a zone to activate a thing, and also for placing a key in a slot. This volume manager should be quite usable as a culling system, as well.

The render management presently supports add and delete parts; it could be extended with show and hide.

The policy could be based on visibility zones established for which rooms can see each other. Or it could be coarser solution, such as a general radius from the camera, extending generally across the sight line. A frustum could be approximated with several AABBs, and a maximum view distance could be imposed, as if the world is murky. (This might even work well with some of the remote-camera-based puzzles…)

More to come.

AABB Collision Resolution, Phase I

The Metareal World is very, very, rectangular.


The classic world is an 8×8 grid of rooms, each room described by a 12×12 grid of characters. (Each room was checkered with two characters and a color scheme.) Your probe could only move exactly up, down, left, right. Most objects also moved only cardinally. (1)

I took a wonderfully simple shortcut for collision detection, which ran fine on a 1 MHz 6502, with maybe a dozen game objects on the 280 x 192 pixel screen at any time. It was this. When drawing each object to the screen, perform a logical AND of each byte it with the pixels underneath it, then XOR it to the screen. After the fact, if any of the ANDs yielded a nonzero result, we know the object has collided with something. The usual response was to simply back it up to the previous position, and maybe reverse its direction if it’s a “bouncer”.

It was visually perfect: if it looks like they collide, they collide. (2)

The modern world, however,


has different rules. The world is still grid-like, but may deviate in places, for dramatic effect. Your probe can move in any direction. Even, like, you know, diagonals. As clever as “pixel collision detection” may have been in 1984, it’s not the right approach any more.

However, the walls shall all be rectangular and planar, and plane-aligned to one of XY, YZ, or ZX. And although objects may be oddly shaped, I’m comfortable treating them, for collision and detection purposes, generally as boxes as well.

In game programming, this kind of geometry is called “axis-aligned bounding-boxes”, or AABB.

The general flow for each frame is:

  • Objects may move or teleport
  • Collisions are assessed
  • Corrections are made to conform to the world rules, like objects can’t move through walls so they need to stop at the wall.

First Problems

Iterating with the design and implementation revealed a few gotchas and false starts.

Problem: Sudden snapping looks bad!

A tempting approach is to look for intersections, and find the minimal move to “fix” the intersection.


But visually, a sudden correction of your line of travel makes no sense. It feels clunky

Solution: Correction must be to move less, not more

To remain visually pleasant and plausible, a correction must only work against the movement. It mustn’t move further than the original movement, or in a different direction. (3)


Problem: Snagging on Seams


As an object moves along a wall, pressing downward, at some point it might seem like the “minimal correction” is to stop the forward motion. At first think, doing the smallest possible correction seems best… But if we handle W2 first, then we “snag” on the wall seam.

Solution: Handle “necessary” corrections first

If we look at each intersection, we can see that on W1, only one correction is possible: upward. W2, on the other hand, has a choice between slightly left, or very up. Since W1 has no choice, we commit to the upward motion. Then we handle W2, which turns out to need no further correction. This also handles the case of corners:


Where each of W1 and W3 have only one viable correction. W2 had a choice, and in effect uses both of them.

Non-Problem: Bullet Penetration

In the Metareal World, things don’t move fast. In some game worlds, we care deeply about projectiles and vehicles passing rapidly through walls. “It can’t happen here.”


It turns out in this world, conveniently, we don’t need to worry about it. Objects can only move a fraction of their own size per frame. (The corner example above depends on this, by the way, to not squirt “between” W1 and W3, leaving only W2 in collision.)

Implementation Details

First thing to note: Metareal does this in 3d, not 2d. Everything is as shown above in the pictures, but in three axes.

The implementation is as follows (4):

  • (This is done for each moving object.)
  • The movement is in some direction, (Mx, My, Mz).
  • For each intersection, describe a correction (Cx, Cy, Cz) that would move it out of the intersection.
    • The correction shows 1 or more possible axes of correction. (-1, 2, 0) would mean that x-1 or y+2 would resolve the intersection.
    • The correction is in the opposite direction from the movement. We never resolve a collision by moving even more than the original movement. This is the rule that prevents lurchy snappings.
  • We go through the list of corrections. Those with a single axis of correction, like (0, 2, 0), are applied. It means there’s no alternatives to that shove, to get out of collision.
  • These mandatory shoves are subtracted from the remaining possible collisions, all of which involve a choice between several possible axes. Thus modified, the smallest axis of correction for each is chosen.

If only the object is moving, and the walls are not, then the shove directions from all collisions have the same signs — in particular, opposite of the movement.

Also, if the object started out not in collision, then the greatest possible correction is simply to undo the movement. This would happen when trying to move into a corner.

Futures and Glossovers

There are still things this algorithm handles poorly.

  • Each object has a priority. Walls have a higher priority than objects, for example. This is taken into account to decide which thing gets shoved.
  • Transitive shoving is not well-handled yet. Ideally, this should be resolved within each frame, not rippling forward in time. Also, shove priority should be transitive as well. For example, normally a P3 object cannot push a P2 object. But if the P3 object is impelled by a P1 object into the P2 object, then, backed by the higher priority (5) force, it should successfully move the P2 object as a result.
  • The actual implementation treats the movement vector as the difference between the object’s movement and the “wall’s” movement (which might not really be a wall). If these are from different directions, the corrections may be contradictory; resolution may be impossible. For now, I average them and let the object get squished between them. But it is not satisfying.
  • In something resembling the “relativistic sliding hole” paradox, a sliding object can slip into gap that is smaller then itself. Then it’s once again in the “pushed from both sides” state, which doesn’t work well.

Some of these can be mostly avoided with careful world design. Transitive shoving will be implemented as part of Phase II; it will be needed.


1. Last night I had a dream involving a post-apocalyptic Oakland (comma, California), vast concrete parking garage-like spaces. The bedraggled, homeless denizens clustered and sat here and there, moving metal scraps and junk around on crude, chalk-drawn boards. They were in some way psychically addicted to playing chess. Anyway, the term “cardinal directions.” I wonder if Rooks ought more properly to be called Cardinals.

2. Except that on the Apple II green and purple don’t collide, they occupy alternate bits. So objects were carefully designed to have white parts at collision-ey points.

3. Sometimes an object uses “teleporting”, where it moves a great distance without expecting smooth visual continuity. In that case, it’s ok to lurch, and indeed we can just take the minimal correction in any direction to enforce the world’s physics.

4. This answer on suggests to simply resolve all your X-penetrations first, and then Y-penetrations. But I’m not convinced.

5. Since I come from an embedded hardware background, I’m accustomed to explaining that interrupts with lower numbers have higher priorities. I carry this convention forward.

Editing, Bitmaps, and Mesh

Added a tremendously useful parameter type to the editor: 1-bit-deep bitmap type. The editor is clunky, by traditional standards, but as ever, fine for me.

One use for a bitmap parameter on an object is to procedurally generate its geometry. Allows a wide variety of expression without importing 3d assets. Since part of the “look” of the product will be “code, code, code”, this works well. The relatedness of the forms conveys the fact nicely.

Generating mesh from bitmap was fun. I used a greedy algorithm to find rectangles from top left to bottom right, knocking out bits once found. Tracing the extrusion edges was trickier. It works by:

  • Building a list of every unit segment which is an edge (0-1 boundaries)
  • Take one, and link to the next & the next, until the star point is re-found
  • Make the sides longer than unit segments: any three matching X’s or Y’s in a row, remove the middle vertex. Repeat.
  • For each loop, determine if it is clockwise or counterclockwise (4 more lefts than right, or four more rights than left
  • Determine if it’s contains bits or spaces — find the leftmost lowest corner, and see if it surrounds a bit or a space
  • If necessary, reverse the loop (bits clockwise, spaces counterclockwise) so the faces face outward from the final geometry.

Also, beginnings of mapping these objects to Volume Collision Objects. The world begins to operate.