JacksonDunstan.com | Optimizing with Bit Twiddling Hacks

Optimizing with Bit Twiddling Hacks

November 26, 2012 Tags: bitwise, performance

Stanford’s Sean Eron Anderson has a great page titled Bit Twiddling Hacks. This is a great resource for C programmers looking to employ bitwise operators (& | ^ ~ << <<< >> >>> etc.) to speed up their code. But what about AS3 programmers? We have bitwise operators too, but will the same tricks work for us? Today’s article ports some of these bit twiddling hacks to AS3 and tests to see if any ground is gained.

A few months ago I tested just one of these bit twiddling hacks and found that Math.abs could be sped up by 20% for integers. Today I’ll test several more of the hacks:

Sign of integer
Detect different signs
Minimum integer (like Math.min)
Maximum integer (like Math.max)
Conditionally set or clear bits
Conditionally negate a value without branching
Count bits set

Here’s the test app that does the testing:

package
{
	import flash.utils.getTimer;
	import flash.text.TextFieldAutoSize;
	import flash.text.TextField;
	import flash.display.Sprite;
 
	public class BitTwiddlingHacks extends Sprite
	{
		private static const BITS_SET_TABLE_256:Vector.<int> = new <int>[0, 0 +1, 0 +1, 0 +2, 0 +1, 0 +1 +1, 0 +1 +1, 0 +1 +2, 0 +1, 0 +1 +1, 0 +1 +1, 0 +1 +2, 0 +2, 0 +2 +1, 0 +2 +1, 0 +2 +2, 0 +1, 0 +1 +1, 0 +1 +1, 0 +1 +2, 0 +1 +1, 0 +1 +1 +1, 0 +1 +1 +1, 0 +1 +1 +2, 0 +1 +1, 0 +1 +1 +1, 0 +1 +1 +1, 0 +1 +1 +2, 0 +1 +2, 0 +1 +2 +1, 0 +1 +2 +1, 0 +1 +2 +2, 0 +1, 0 +1 +1, 0 +1 +1, 0 +1 +2, 0 +1 +1, 0 +1 +1 +1, 0 +1 +1 +1, 0 +1 +1 +2, 0 +1 +1, 0 +1 +1 +1, 0 +1 +1 +1, 0 +1 +1 +2, 0 +1 +2, 0 +1 +2 +1, 0 +1 +2 +1, 0 +1 +2 +2, 0 +2, 0 +2 +1, 0 +2 +1, 0 +2 +2, 0 +2 +1, 0 +2 +1 +1, 0 +2 +1 +1, 0 +2 +1 +2, 0 +2 +1, 0 +2 +1 +1, 0 +2 +1 +1, 0 +2 +1 +2, 0 +2 +2, 0 +2 +2 +1, 0 +2 +2 +1, 0 +2 +2 +2, 1, 1 +1, 1 +1, 1 +2, 1 +1, 1 +1 +1, 1 +1 +1, 1 +1 +2, 1 +1, 1 +1 +1, 1 +1 +1, 1 +1 +2, 1 +2, 1 +2 +1, 1 +2 +1, 1 +2 +2, 1 +1, 1 +1 +1, 1 +1 +1, 1 +1 +2, 1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +2, 1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +2, 1 +1 +2, 1 +1 +2 +1, 1 +1 +2 +1, 1 +1 +2 +2, 1 +1, 1 +1 +1, 1 +1 +1, 1 +1 +2, 1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +2, 1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +2, 1 +1 +2, 1 +1 +2 +1, 1 +1 +2 +1, 1 +1 +2 +2, 1 +2, 1 +2 +1, 1 +2 +1, 1 +2 +2, 1 +2 +1, 1 +2 +1 +1, 1 +2 +1 +1, 1 +2 +1 +2, 1 +2 +1, 1 +2 +1 +1, 1 +2 +1 +1, 1 +2 +1 +2, 1 +2 +2, 1 +2 +2 +1, 1 +2 +2 +1, 1 +2 +2 +2, 1, 1 +1, 1 +1, 1 +2, 1 +1, 1 +1 +1, 1 +1 +1, 1 +1 +2, 1 +1, 1 +1 +1, 1 +1 +1, 1 +1 +2, 1 +2, 1 +2 +1, 1 +2 +1, 1 +2 +2, 1 +1, 1 +1 +1, 1 +1 +1, 1 +1 +2, 1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +2, 1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +2, 1 +1 +2, 1 +1 +2 +1, 1 +1 +2 +1, 1 +1 +2 +2, 1 +1, 1 +1 +1, 1 +1 +1, 1 +1 +2, 1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +2, 1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +1, 1 +1 +1 +2, 1 +1 +2, 1 +1 +2 +1, 1 +1 +2 +1, 1 +1 +2 +2, 1 +2, 1 +2 +1, 1 +2 +1, 1 +2 +2, 1 +2 +1, 1 +2 +1 +1, 1 +2 +1 +1, 1 +2 +1 +2, 1 +2 +1, 1 +2 +1 +1, 1 +2 +1 +1, 1 +2 +1 +2, 1 +2 +2, 1 +2 +2 +1, 1 +2 +2 +1, 1 +2 +2 +2, 2, 2 +1, 2 +1, 2 +2, 2 +1, 2 +1 +1, 2 +1 +1, 2 +1 +2, 2 +1, 2 +1 +1, 2 +1 +1, 2 +1 +2, 2 +2, 2 +2 +1, 2 +2 +1, 2 +2 +2, 2 +1, 2 +1 +1, 2 +1 +1, 2 +1 +2, 2 +1 +1, 2 +1 +1 +1, 2 +1 +1 +1, 2 +1 +1 +2, 2 +1 +1, 2 +1 +1 +1, 2 +1 +1 +1, 2 +1 +1 +2, 2 +1 +2, 2 +1 +2 +1, 2 +1 +2 +1, 2 +1 +2 +2, 2 +1, 2 +1 +1, 2 +1 +1, 2 +1 +2, 2 +1 +1, 2 +1 +1 +1, 2 +1 +1 +1, 2 +1 +1 +2, 2 +1 +1, 2 +1 +1 +1, 2 +1 +1 +1, 2 +1 +1 +2, 2 +1 +2, 2 +1 +2 +1, 2 +1 +2 +1, 2 +1 +2 +2, 2 +2, 2 +2 +1, 2 +2 +1, 2 +2 +2, 2 +2 +1, 2 +2 +1 +1, 2 +2 +1 +1, 2 +2 +1 +2, 2 +2 +1, 2 +2 +1 +1, 2 +2 +1 +1, 2 +2 +1 +2, 2 +2 +2, 2 +2 +2 +1, 2 +2 +2 +1, 2 +2 +2 +2];
 
		private var logger:TextField = new TextField();
		private function row(...cols): void { logger.appendText(cols.join(",")+"\n"); }
 
		public function BitTwiddlingHacks()
		{
			logger.autoSize = TextFieldAutoSize.LEFT;
			addChild(logger);
 
			init();
		}
 
		private function init(): void
		{
			var beforeTime:int;
			var afterTime:int;
			var i:int;
			var REPS:int = 100000000;
			var normalTime:int;
			var twiddlingTime:int;
			var input:int;
			var input2:int;
			var inputBool:Boolean;
			var result:int;
			var resultBool:Boolean;
			var BitsSetTable256:Vector.<int> = BITS_SET_TABLE_256;
 
			row("Function,Normal,Twiddling");
 
			//////////////////////////////////
			// Sign of integer (when positive)
			//////////////////////////////////
 
			input = 10;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = input < 0 ? -1 : 0;
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = -(input>>31);
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Sign of integer (when positive)", normalTime, twiddlingTime);
 
			//////////////////////////////////
			// Sign of integer (when negative)
			//////////////////////////////////
 
			input = -10;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = input < 0 ? -1 : 0;
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = -(input>>31);
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Sign of integer (when negative)", normalTime, twiddlingTime);
 
			/////////////////////////
			// Detect different signs
			/////////////////////////
 
			input = 10;
			input2 = 30;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				resultBool = (input < 0 && input2 < 0) || (input >= 0 && input2 >= 0);
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				resultBool = ((input ^ input2) < 0);
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Detect different signs", normalTime, twiddlingTime);
 
			//////////////////
			// Minimum integer
			//////////////////
 
			input = 10;
			input2 = 30;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = input < input2 ? input : input2;
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = input2 ^ ((input ^ input2) & -(input < input2 ? 1 : 0));
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Minimum integer", normalTime, twiddlingTime);
 
			//////////////////
			// Maximum integer
			//////////////////
 
			input = 10;
			input2 = 30;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = input > input2 ? input : input2;
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = input ^ ((input ^ input2) & -(input < input2 ? 1 : 0));
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Maximum integer", normalTime, twiddlingTime);
 
			/////////////////////////////////////////
			// Conditionally set or clear bits (true)
			/////////////////////////////////////////
 
			input = 0; // to modify
			input2 = 6; // mask
			inputBool = true; // set bits
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = inputBool ? (input | input2) : (input & ~input2);
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = (input & ~input2) | (-(inputBool ? 1 : 0) & input2);
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Conditionally set or clear bits (true)", normalTime, twiddlingTime);
 
			//////////////////////////////////////////
			// Conditionally set or clear bits (false)
			//////////////////////////////////////////
 
			input = 0; // to modify
			input2 = 6; // mask
			inputBool = false; // clear bits
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = inputBool ? (input | input2) : (input & ~input2);
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = (input & ~input2) | (-(inputBool ? 1 : 0) & input2);
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Conditionally set or clear bits (false)", normalTime, twiddlingTime);
 
			////////////////////////////////////////////////////////
			// Conditionally negate a value without branching (true)
			////////////////////////////////////////////////////////
 
			input = 30;
			inputBool = false; // do negate (this flag is to NOT negate)
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = inputBool ? -input : input;
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				input2 = inputBool ? 1 : 0; // temp
				result = (input2 ^ (input2 - 1)) * input;
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Conditionally negate a value without branching (true)", normalTime, twiddlingTime);
 
			/////////////////////////////////////////////////////////
			// Conditionally negate a value without branching (false)
			/////////////////////////////////////////////////////////
 
			input = 30;
			inputBool = true; // don't negate (this flag is to NOT negate)
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = inputBool ? -input : input;
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				input2 = inputBool ? 1 : 0; // temp
				result = (input2 ^ (input2 - 1)) * input;
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Conditionally negate a value without branching (false)", normalTime, twiddlingTime);
 
			/////////////////
			// Count bits set
			/////////////////
 
			input = 30;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				input2 = input; // temp
				for (result = 0; input2; input2 >>= 1)
				{
					result += input2 & 1;
				}
			}
			afterTime = getTimer();
			normalTime = afterTime - beforeTime;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				result = BitsSetTable256[input & 0xff] + 
					BitsSetTable256[(input >> 8) & 0xff] + 
					BitsSetTable256[(input >> 16) & 0xff] + 
					BitsSetTable256[input >> 24];
			}
			afterTime = getTimer();
			twiddlingTime = afterTime - beforeTime;
 
			row("Count bits set", normalTime, twiddlingTime);
		}
	}
}

I ran this test app in the following environment:

Flex SDK (MXMLC) 4.6.0.23201, compiling in release mode (no debugging or verbose stack traces)
Release version of Flash Player 11.5.31.2
2.3 Ghz Intel Core i7
Mac OS X 10.8.2

And here are the results I got:

Function	Normal	Twiddling
Sign of integer (when positive)	220	220
Sign of integer (when negative)	217	216
Detect different signs	261	164
Minimum integer	216	370
Maximum integer	156	372
Conditionally set or clear bits (true)	194	402
Conditionally set or clear bits (false)	205	402
Conditionally negate a value without branching (true)	184	186
Conditionally negate a value without branching (false)	282	216
Count bits set	957	586

Performance Graph

There’s definitely a mixed bag of performance here. Sometimes the bit twiddling version is two times slower, sometimes it’s just as fast, sometimes it’s a little faster, and sometimes it’s two times faster. While always fancier, bitwise methods are not always faster. Thankfully, the above performance results should serve as a handy reference guide for the next time you’re thinking of using some bitwise tricks. You could double your performance… or cut it in half.

Spot a bug? Have a suggestion? Post a comment!

Comments

#1 by orion elenzil on November 27th, 2012 · Reply

some of the cases where the bit-twiddling conversion is slower are hardly surprising.

eg, which of these will be faster ?

result = input < input2 ? input : input2; result = input2 ^ ((input ^ input2) & -(input < input2 ? 1 : 0));

the latter is certainly doing some bit-twiddling, but it’s doing it in addition to the exact same ternary operation the former is doing.

one of the main points of doing bit hacks is to avoid branching, and the ternary doesn’t really count as avoiding branching.
for example, here’s an example implemented once with traditional if/else and once with ternary:

public function bitTest() { var a:Number; var x:Number = 3; var y:Number = 4; // ternary: // a = (x < y ? 1 : 0); // if_else: // if (x < y) { // a = 1; // } // else { // a = 0; // } }

and here is the resulting release-build .swf for each decompiled:
(decompiled using swfdump -a)

ternary:
00000) + 0:0 getlocal_0
00001) + 1:0 pushscope
00002) + 0:1 pushnan
00003) + 1:1 setlocal_1
00004) + 0:1 getlocal_0
00005) + 1:1 constructsuper 0 params
00006) + 0:1 pushbyte 3
00007) + 1:1 convert_d
00008) + 1:1 setlocal_2
00009) + 0:1 pushbyte 4
00010) + 1:1 convert_d
00011) + 1:1 setlocal_3
00012) + 0:1 getlocal_2
00013) + 1:1 getlocal_3
00014) + 2:1 lessthan
00015) + 1:1 iffalse ->19 21
00019) + 0:1 pushbyte 0
00020) + 1:1 convert_d
00021) + 1:1 convert_d
00022) + 1:1 setlocal_1
00023) + 0:1 returnvoid

if/else:
00000) + 0:0 getlocal_0
00001) + 1:0 pushscope
00002) + 0:1 pushnan
00003) + 1:1 setlocal_1
00004) + 0:1 getlocal_0
00005) + 1:1 constructsuper 0 params
00006) + 0:1 pushbyte 3
00007) + 1:1 convert_d
00008) + 1:1 setlocal_2
00009) + 0:1 pushbyte 4
00010) + 1:1 convert_d
00011) + 1:1 setlocal_3
00012) + 0:1 getlocal_2
00013) + 1:1 getlocal_3
00014) + 2:1 ifnlt ->19 22
00019) + 0:1 pushbyte 0
00020) + 1:1 convert_d
00021) + 1:1 setlocal_1
00022) + 0:1 returnvoid

i wonder if there’s a different branch-prediction success rate for “ifnlt” and “iffalse”.

in the “count bits set” test,
it might be interesting to use a value larger than 30.
that is, the speed of the twiddling method should be a constant no matter what the input value is,
while the speed of the traditional method should be proportional to the position of the highest bit set.
so eg, for 30 (0b11110), the number of iterations will be 5,
while for 0x800000, the number of iterations will be 32.
- #2 by jackson on November 27th, 2012 · Reply
  
  Very good points. In porting some of the hacks it was necessary to make some of them slower due to differences between C and AS3. One major problem is, as you point out, that a Boolean can’t be simply treated as an int or uint. This led to the bool ? 1 : 0 expression you see a lot. Regardless, the point was to show that not all of the hacks result in better performance when ported to AS3, but I could have been a little clearer as to why.
  
  As for the “count bits set” test, that is definitely something to try out. For simplicity’s sake I just had one test for the results, but a fuller range would yield some more clarity. Using 0xFFFFFFFF as an input would certainly take long with the normal method and widen the gap so that the bitwise version is even faster.
  - #3 by orion elenzil on November 27th, 2012 · Reply
    
    roger.
    
    i think text-formatting stripped out a comment from the decompiled versions,
    just noting that the decompile of the ternary operator includes a conditional branch at index 15.
  - #4 by skyboy on January 1st, 2013 · Reply
    
    bool ? 1 : 0 isn’t nessicary: int(bool) will do the same thing, but is much faster since the JIT sweeps it up and converts it to a single opcode: convert_i. It is slightly more expensive than in C. pseudocode:
    
    switch (atom & 7) { // ... case 5: // boolean's tag is 5: 101 in binary (int is 6: 110) http://hg.mozilla.org/tamarin-redux/file/dd76690012c5/core/atom.h#l76 return (atom & ~7) ^ 6; // ~7 is optimized by the compiler: this code is for C++ and is the core of AVM2. The compiler would probably even optimize this statement into @atom + 1@ // ... }
    
    I’d cite what the AVM2 actually does, but i don’t really want to look through the dozens of files and thousands of lines of code to find it. :P
    However, the end result is that int(boolean) is very nearly free.
    - #5 by jackson on January 1st, 2013 · Reply
      
      Great tip!
      - #6 by benjamin guihaire on June 21st, 2013 · Reply
        
        can you redo the profiling with int(bool) instead ?
        
        also , in resultBool = ((input ^ input2) >31
        
        in this one:
        result = input2 ^ ((input ^ input2) & -(input < input2 ? 1 : 0));
        the part doing this: -(input >31) ?
        
        #7 by jackson on June 21st, 2013 · Reply
        
        Perhaps in an update article I’ll check with int(bool) instead of bool ? 1 : 0.
        
        I’m not sure what your question is about -(input > 31). Could you clarify?
        
        #8 by benjamin guihaire on June 22nd, 2013 · Reply
        
        Sorry, I was not clear, my proposal is to instead of doing this:
        result = input2 ^ ((input ^ input2) & -(input < input2 ? 1 : 0));
        do that:
        result = input2 ^ ((input ^ input2) & -((input-input2)<<31);
        to avoid the ternary operator.
        ((input-input2) will be negative when input<input2, so the bit sign will be set..)
        
        #9 by jackson on June 22nd, 2013 · Reply
        
        No problem. Let’s consider an example:
        
        var input:int = 1; var input2:int = 2; var input3:int = (input - input2) << 31; trace(input3); // output: -2147483648
        
        But the original version gives either 1 or 0 as an output, so they’re not equivalent.
        
        Perhaps I’m missing something. Care to elaborate how the <<31 version works?
        
        #10 by benjamin guihaire on August 6th, 2013 · Reply
        
        okay … one more time (can you delete the extra comments?)
        It looks like I need to type the comments with & lt ; to get it to show properly:
        
        result = input2 ^ ((input ^ input2) & ((input-input2)>>31));
        
        in this test, if input<input2 , input-input2 is negative, and a negative number with >>31 should be -1, which was the test with -(input < input2 ? 1 : 0)); but my version should be faster than the ternary operator.
#11 by crussell on April 7th, 2013 · Reply

I would be really wary of trying to prove that the bit twiddling hack performs worse when you are using inputs that remain constant over the repetitions that your are timing. You need to realize the power of a quickly saturating local branch predictor in these tests. On a superscalar out-of-order architecture such as that of the i7 you are using, the price of branch mis-predicts can be huge. You should try this experiment again using a randomized input on each iteration of the loops you are exercising.
- #12 by jackson on April 7th, 2013 · Reply
  
  Good point about branch prediction. I’d like to use more varied inputs, but it’s difficult in a test like this. The cost of a Math.random call (or equivalent) would be quite large compared to the actual work being tested. For example, if the Math.random call takes 100 nanoseconds, the bit twiddling hack takes 50 nanoseconds, and the normal version takes 75 nanoseconds, you get results of 150 and 175. The relative order is preserved but the difference looks much smaller than it actually is. I’ve tried to measure the Math.random call by itself and subtract it out in the past, but that usually has very poor results with regards to consistency. There are a lot of negative values that come up when using such a technique and I generally don’t think the technique yields high-quality result sets.
  
  I am totally open to suggestions though. Please let me know if you’ve got a good way to add some variation to the test inputs without skewing the results. I would happily incorporate such a technique into my testing approach.
  - #13 by crussell on April 8th, 2013 · Reply
    
    A solution might be to populate a large array up front with the random values. Try to make it not too large so that the majority of it stays in the cache. Since there is really no global correlation with the branches you are testing, all you need to do is make sure that the randomization array is large enough to negate any pattern correlation with the local prediction. I would start with an array of size 2^10 or so and make your REPS a multiple of that. Just index your array as rand_array[i & 0x3FF]. Hopefully this will get rid of a lot of the overhead of calling Math.random directly during the timing portion of your code.
    - #14 by jackson on April 9th, 2013 · Reply
      
      That’s a good idea for reducing the amount of randomizing per loop. Unfortunately, even just a Vector indexing takes quite a while. I added a fake test that just indexes BitsSetTable256 and it takes about 220 milliseconds on the same test. That result is actually larger than half the results in the test, so the level of skewing would be quite extreme were I to add a Vector index to each test: results would approximately double and hide lots of details as described above. Your idea is sound, but I’m still looking for a way to implement it without skewing results. If you (or anyone!) have any more ideas, I’m all ears.