Ok, something a bit weird for a change.
Many old books on computers start by teaching the principles of binary notation, and conversions between binary, hexadecimal and decimal. This is usually shown with something resembling the formula 128*b7 + 64*b6 + 32*b5 + 16*b4 + 8*b3 + 4*b2 + 2*b1 + b0.
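The textbook formula can be written out directly as a small Python sketch (the function name is my own, just for illustration):

```python
# The "book" formula: each bit contributes its weight to the decimal value.
def bits_to_decimal(b7, b6, b5, b4, b3, b2, b1, b0):
    return 128*b7 + 64*b6 + 32*b5 + 16*b4 + 8*b3 + 4*b2 + 2*b1 + b0

print(bits_to_decimal(0, 0, 0, 1, 1, 0, 0, 0))  # 24, i.e. BIN 00011000
print(bits_to_decimal(1, 1, 1, 1, 1, 1, 1, 1))  # 255
```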

From An Introduction to Microcomputers, Volume 0: The Beginner's Book. Third edition. (Osborne & Bunnell, 1982) 
Yet whenever I worked with hex, I never really grasped this mathematical underpinning of the system. If I wrote a snippet of C64 machine code, to me "$D020" would be the memorized "name" of where I put the border colour value, and I would not really give much thought to what it was in decimal. Similarly, I just knew that $C000 was 49152, without doing the conversion.
Also, when building user-defined graphic characters in Spectrum BASIC, I'd be lazy and use the BIN 00011000 style statements instead of converting them into the shorter decimal form. Even then, if the patterns were simple and repeated often, I'd use 255 for BIN 11111111 and 24 for BIN 00011000 because I "knew" these values by heart, not because I was good at calculating the decimal in my "head".

Yes, visually simple but a bit silly. 
Yeah, I can nowadays move between small binary, decimal and hex values, but I could still do better. Although the modern programmer does not need to know that much about binary or hex, with retro computers and digital electronics they are quite relevant.
From my previous experiences, I have begun to think: why would learning binary be different from learning guitar chords or another alphabet? People can memorize numerous guitar chords without having a through-and-through understanding of the mathematics of chord theory. Did the computer books go about the whole binary thing a bit the wrong way, especially for beginners?
I think the guitar chords are a good example: Through perseverance, a number of chords can be memorized directly. Knowing three chords, something satisfying can already be played. Then you can add more chords to your repertoire as you learn more complex and more satisfying songs. But learning really takes off when you see that somewhat similar chords can be grouped sensibly.
The Binary alphabet
With this in mind, I began pursuing a learning approach that would help me convert binary/hex/decimal. I can't vouch for any "system" here but perhaps this gives some ideas.

The "Binary Alphabet" 
I imagined each hex digit as having an alternative 4-bit binary "glyph" or notation that needs to be memorized. These are simply binary versions of the same numbers, yet visualized as blocks. I've abandoned the 0/1 notation, because we're not necessarily dealing with numbers, but HIGH/LOW, ON/OFF and TRUE/FALSE states, or indeed, bitmaps.

Most of us carry practice equipment with us. Just don't go showing a "4" around in public places. 
With eight-bit numbers (0–255, or 0x00–0xFF, or $00–$FF) the amount of "letters" in the alphabet would be too big to memorize. However, only 16 "glyphs" are really needed when moving between hex and binary, because each hex digit corresponds directly to four binary digits.
Each 8-bit hexadecimal number is made out of two such components. Below is an example of a few such combinations. Referring to the above alphabet, it can be easily seen how the hex values are built, without any "calculation" going on.
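The glyph idea can be sketched in Python, rendering each hex digit as a 4-bit block pattern (the dot/hash rendering and the names are my own convention, not anything standard):

```python
# Each hex digit maps to a 4-bit "glyph"; a byte is just two glyphs side by side.
GLYPHS = {d: format(int(d, 16), "04b").replace("0", ".").replace("1", "#")
          for d in "0123456789ABCDEF"}

def byte_glyph(hex_byte):
    # e.g. "18" -> glyph of 1 followed by glyph of 8
    return GLYPHS[hex_byte[0]] + GLYPHS[hex_byte[1]]

print(byte_glyph("18"))  # ...##...
print(byte_glyph("FF"))  # ########
```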
This would already form the basis for the notation of 8x8 characters, familiar from 1980s microcomputers. Note that as the required pieces are repeated, there's a need for only a few distinct values. (Remember the three-chord analogy...)
A lot of 8x8 graphics can be worked out with just $0,$3,$6,$9 and $F.
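As a quick illustration, an 8x8 character can be rendered row by row from hex bytes whose digits all come from that small set; the pattern below is arbitrary, just something I made up for the sketch:

```python
# Render an 8x8 character: each byte is one row, each set bit one pixel.
# All digits here are drawn from $0, $3, $6, $9 and $F only.
rows = [0x00, 0x66, 0x99, 0x99, 0x66, 0x00, 0xFF, 0x00]
for row in rows:
    print(format(row, "08b").replace("0", ".").replace("1", "#"))
```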
Alphabet groupings
Normal letters are grouped into vowels and consonants; musical chords can be sharp, flat, diminished and so on. Grouping can advance learning and analysis. Both the alphabet and chords are often taught from the simpler towards the more complex. Below, I have attempted to group the binary glyphs according to simplicity and order.
Looking at the table below, it should be obvious that 0x0 is 0000 and 0xF is 1111. Also, the basic components 1, 2, 4 and 8 ought to be simple to learn by heart.
It should also be apparent that by moving the block right or left, the number becomes divided or multiplied by two. This holds true for all binary graphics that can be moved so that the figure still stays intact. This is really the basis of the grouping here. So, 0101 is five (0x5) in decimal, but 1010 is ten (0xA). The point here is that knowing one can lead to knowing the other: knowing 5 you will know 10, and vice versa.
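The shifting rule is exactly what the `<<` and `>>` operators do, so it can be checked in a couple of lines of Python:

```python
# Shifting left doubles the value, shifting right halves it,
# as long as no set bit falls off the edge of the nibble.
five = 0b0101
print(five << 1)         # 10, i.e. 1010
print((five << 1) >> 1)  # 5, back where we started
```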
There are only three "letters" that are not grouped in this way in the above system, 0x9, 0xB and 0xD, as they can't be shifted. (Neither can 0 or 15, but they are exempt from this difficulty anyway.)
There's another way to group the binary family that might be helpful. The image below limits the family to eight "root" glyphs, which is half of the sixteen.
Each of these base values gains +1 if the last bit is set. Every 4-bit binary value whose last bit is 0 can be changed into value+1 by setting that bit. This may seem a bit strange at this point, but consider the ungrouped values above. The 9 can be considered an "8" with +1 added, the "11" is "really" a "10" with 1 added, and "13" is a 12+1.
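In code, the +1 variation is just setting the lowest bit with an OR, which covers exactly the three ungrouped glyphs:

```python
# Each "root" glyph ends in 0; OR-ing in the last bit adds one.
for root in (0b1000, 0b1010, 0b1100):
    print(format(root, "04b"), root, "->", format(root | 1, "04b"), root | 1)
```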
I only show this because it has some resemblance to the way guitar and piano chords can be varied by adjusting the last note. (Think of the difference between the open guitar chords C and C7.) This is more to demonstrate that there are a number of approaches by which to exercise your binary muscles without going too deep into the math side of things.
Another additional tool is to learn to rapidly invert the values and gain the result from a simple calculation. Meaning, each of the empty/white regions is considered black and vice versa. Looking at 1001, I can think of 0110, which is 6, at which point I have to think "15 − 6", which brings up 9, the decimal value of 1001. But this is already a bit cumbersome and introduces calculation in a way I hoped to avoid.
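The inversion trick has a neat arithmetic identity behind it: flipping all four bits of a nibble n always gives 15 − n. A small sketch (the function name is mine):

```python
# Flipping all four bits of n is the same as computing 15 - n.
def invert_nibble(n):
    return n ^ 0b1111  # XOR with 1111 flips every bit

print(invert_nibble(0b1001))       # 6, i.e. 0110
print(15 - invert_nibble(0b1001))  # 9, the value of 1001
```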
I think the sixteen shapes could simply be learned by heart. Most of the above might be dismissed as tricks that can be somewhat helpful in the process of learning. Yet they also have added significance in that inverting and shifting are common and useful operations in machine code and discrete logic. To "know" binary and hex is to "know" how they are to be manipulated; it is not simply a matter of converting them between each other.
Towards larger numbers?
The conversion between binary and hex should be fairly simple, even if the numbers are larger. So, 0xC020 is 1100 0000 0010 0000, and 0xB4CD becomes 1011 0100 1100 1101.
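This digit-by-digit conversion is mechanical enough to sketch in a one-liner (the function name is my own):

```python
# Convert a 16-bit value to binary one hex digit (nibble) at a time.
def hex_to_nibbles(word):
    return " ".join(format(int(d, 16), "04b") for d in format(word, "04X"))

print(hex_to_nibbles(0xC020))  # 1100 0000 0010 0000
print(hex_to_nibbles(0xB4CD))  # 1011 0100 1100 1101
```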
At this point the goal is to instantly recognize the hexadecimal and decimal value of any 4-bit binary number, and the decimal values 0–15.
Indeed, what about decimal conversion? Here I'm on a bit shakier ground. For 8-bit numbers (0–255), one could learn a second "alphabet" for the higher part of the byte:
These are the previous values multiplied by 16. This value then has to be added to the lower part, and I'm uncertain if this is a handier system than simply adding all the binary weights together just as the books say.
Besides, the route for converting decimal values back to binary is not as obvious. You would have to see that a value such as decimal 119 is above 112 yet below 128, and so would have the 0111 high portion.
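The split into a high and low part is just integer division and remainder by 16, as the 119 example shows:

```python
# The "second alphabet" idea: high nibble * 16 plus low nibble.
# Decimal 119 sits between 112 (0x70) and 128 (0x80), so its high nibble is 7 (0111).
value = 119
high, low = value // 16, value % 16
print(high, low)        # 7 7
print(high * 16 + low)  # 119
```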
The shifting rule of course always works, so 224 (1110 0000) shifted right is 112 (0111 0000) and vice versa.
For 16-bit numbers, I'm afraid there would have to be one more of these alphabets. A 16-bit number is high byte * 256 + low byte, but here are only the highest 4 bits of a 16-bit value:
This may seem unreasonable to memorize, and the shifting rule becomes less useful. But if I'm targeting an understanding of a typical 8-bit computer memory map, it becomes somewhat more meaningful. The memory can be visualized as 16 equal-sized (4K) chunks. Available memory spaces and memory-mapped portions in 8-bit computers often start from such "even addresses": for example, the VIC chip of the C64 starts at 0xD000 = 1101 0000 0000 0000 = 53248. The ZX Spectrum screen memory (and RAM altogether) starts at 0x4000 = 0100 0000 0000 0000 = 16384. 0x8000 is the halfway point in the 16-bit memory space, 32768.
So, 0xC000, 49152, is 1100 0000 0000 0000, and 53280 (the C64 border colour address) is 1101 0000 0010 0000, 0xD020.
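The "top hex digit names the 4K chunk" view amounts to shifting the address right by 12 bits, which a few lines of Python can demonstrate:

```python
# The highest hex digit of a 16-bit address identifies its 4K chunk.
for addr in (0x4000, 0x8000, 0xC000, 0xD020):
    print(f"${addr:04X} = {addr:5d}, top digit ${addr >> 12:X}")
```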
Yet, the above is less helpful for deciphering a random 16-bit value. I'd need yet another "alphabet" for the lower part of the high byte, which I just don't think is the way to go.

Just an example. Not worth the trouble really. 
So, the system would end up being exactly the kind of addition game I was hoping to avoid in the first place. The best bet might be to use a combination of memorization and the "book system" of adding the 32768+16384+8192+4096+2048+1024+512+256 values together.
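For completeness, the "book system" itself is a one-liner, summing the power of two for each set bit (function name mine):

```python
# The book method: add up the power-of-two weight of every set bit.
def book_sum(bits):  # bits as a string, most significant bit first
    return sum(2 ** i for i, b in enumerate(reversed(bits)) if b == "1")

print(book_sum("1101000000100000"))  # 53280, i.e. $D020
print(book_sum("11111111"))          # 255
```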
But, umm, yes, anyways. At least the hex/bin thingy is easy. And 8-bit numbers are not that bad. I think I'll go back to doing whatever I was doing.