The first thing I'd look at is having 1 bit to determine if the block is "fully solid" (hull, weapon computer, etc) or not (glass, flowers, wedges, etc). This is important for faster rendering (hidden surface removal - if 2 adjacent blocks are both "fully solid" then the adjoining faces of both blocks can be skipped).
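To make the hidden surface test concrete, here is a minimal sketch in Java. The class name, the method names, and the choice of bit 15 as the "fully solid" flag are my own assumptions for illustration, not anything taken from StarMade's code.

```java
final class FaceCulling {
    // Assumption for this sketch: bit 15 of the 16-bit block value is the "fully solid" flag.
    static final int SOLID_BIT = 0x8000;

    static boolean isSolid(short block) {
        return (block & SOLID_BIT) != 0;
    }

    // The shared face between two adjacent blocks can be skipped only when BOTH are fully solid.
    static boolean canSkipSharedFace(short a, short b) {
        // AND-ing the two values first means the flag survives only if it is set in both.
        return (a & b & SOLID_BIT) != 0;
    }
}
```

The nice part is that the test is a single AND and compare, with no lookup into a block type table.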
For the solid blocks, you can have 3 bits for orientation and 1 bit for activation status, leaving 11 bits for solid block IDs. That's 2048 solid blocks.
For the non-solids, you'd only need an activation bit for rod lights, and it'd make more sense just having different IDs instead (e.g. one ID for "white rod light that's off" and another ID for "white rod light that's on", etc). If you use 6 bits for orientation, you're left with 9 bits for the block ID. That's 512 block IDs.
That gives a total of 2560 block IDs (2048 solid plus 512 non-solid), plus activation and orientation. However, I've only used 16 bits, and I've "forgotten" HP.
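As a rough sketch of the 16-bit layout described above, something like the following would work in Java. The exact bit positions (flag in bit 15, orientation directly below it, ID in the low bits) and all names are assumptions I picked for the example; only the field widths come from the description.

```java
final class BlockCodec {
    // Assumed layout, solid:     [15] solid=1 | [14..12] orientation | [11] active | [10..0] id
    // Assumed layout, non-solid: [15] solid=0 | [14..9]  orientation | [8..0] id
    static final int SOLID_FLAG = 1 << 15;
    static final int ACTIVE_FLAG = 1 << 11;

    static short packSolid(int orientation, boolean active, int id) {
        if (orientation < 0 || orientation > 7) throw new IllegalArgumentException("orientation needs 3 bits");
        if (id < 0 || id > 2047) throw new IllegalArgumentException("solid id needs 11 bits");
        return (short) (SOLID_FLAG | (orientation << 12) | (active ? ACTIVE_FLAG : 0) | id);
    }

    static short packNonSolid(int orientation, int id) {
        if (orientation < 0 || orientation > 63) throw new IllegalArgumentException("orientation needs 6 bits");
        if (id < 0 || id > 511) throw new IllegalArgumentException("non-solid id needs 9 bits");
        return (short) ((orientation << 9) | id);
    }

    static boolean isSolid(short block) {
        return (block & SOLID_FLAG) != 0;
    }

    static int orientation(short block) {
        // Masking after the shift discards the sign-extended upper bits.
        return isSolid(block) ? (block >> 12) & 0x7 : (block >> 9) & 0x3F;
    }

    static boolean isActive(short block) {
        // Only solid blocks carry an activation bit in this layout.
        return isSolid(block) && (block & ACTIVE_FLAG) != 0;
    }

    static int id(short block) {
        return isSolid(block) ? block & 0x7FF : block & 0x1FF;
    }
}
```

For example, `BlockCodec.packSolid(5, true, 2047)` and `BlockCodec.packNonSolid(63, 511)` use the largest ID in each group, which is where the 2048 + 512 = 2560 total comes from.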
The thing is, computers like "powers of 2" (especially 80x86, especially for array indexing where there's special support built right into the instruction set). Numbers like 2, 4 and 8 bytes? Lovely. Numbers like 3 bytes? Slow.
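To illustrate the difference, here is a small Java sketch of what reading one entry looks like for each size. The class and method names are made up, and the little-endian byte order of the 3-byte record is an assumption for the example, not StarMade's actual format.

```java
final class RecordAccess {
    // 3 bytes per entry: there is no 3-byte primitive, so every read is a
    // multiply by 3, three separate byte loads, and shifts/ORs to reassemble.
    static int read3(byte[] data, int index) {
        int offset = index * 3;
        return (data[offset] & 0xFF)
                | ((data[offset + 1] & 0xFF) << 8)
                | ((data[offset + 2] & 0xFF) << 16);
    }

    // 2 bytes per entry: one naturally sized load from a short[].
    static int read2(short[] data, int index) {
        return data[index] & 0xFFFF;
    }
}
```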
The other thing is cache locality. Cache misses are very expensive and the size of the CPU's caches is limited (especially the fastest L1 and L2 caches). You want to cram as much of the data you need as possible into every cache line you fetch. When a missile hits a ship and you're calculating how much got damaged, you're primarily accessing the HP values and don't care too much what types of blocks they are (until/unless a block is destroyed); so you don't want cache lines containing "HP data" bloated up with the block IDs. When you're calculating power regen, or which weapon blocks are part of which group, you don't care about the HP at all; and don't want your "block ID data" bloated up with HP values. When you're spawning a ship from a blueprint, don't store the HP values in the blueprint at all; generate them when the ship is loaded instead. That reduces file size while also avoiding "brand new ship came with damage!" problems.
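A minimal Java sketch of that split might look like this. Everything here is hypothetical: the class name, the convention that a block value of 0 means "no block", and giving every block the same full HP on load (a real game would look up max HP per block type).

```java
final class ShipBlocks {
    static final short EMPTY = 0;  // assumption: 0 means "no block here"

    final short[] blockData;  // packed type/orientation/activation, 2 bytes per block
    final byte[] hp;          // hit points, 1 byte per block, kept in its own array

    private ShipBlocks(short[] blockData, byte[] hp) {
        this.blockData = blockData;
        this.hp = hp;
    }

    // The blueprint holds block data only; HP is generated at load time.
    static ShipBlocks fromBlueprint(short[] blueprintBlocks, int fullHp) {
        if (fullHp < 1 || fullHp > 255) throw new IllegalArgumentException("fullHp must be 1..255");
        byte[] hp = new byte[blueprintBlocks.length];
        for (int i = 0; i < hp.length; i++) {
            hp[i] = (byte) (blueprintBlocks[i] == EMPTY ? 0 : fullHp);
        }
        return new ShipBlocks(blueprintBlocks.clone(), hp);
    }

    // Damage pass: reads only the hp array unless a block is actually destroyed.
    int applyDamage(int from, int to, int damage) {
        int destroyed = 0;
        for (int i = from; i < to; i++) {
            int current = hp[i] & 0xFF;  // Java bytes are signed, so mask to 0..255
            if (current == 0) continue;
            int remaining = Math.max(0, current - damage);
            hp[i] = (byte) remaining;
            if (remaining == 0) {
                blockData[i] = EMPTY;  // the only point where block data is touched
                destroyed++;
            }
        }
        return destroyed;
    }

    // Type pass (e.g. counting a kind of block): never touches the hp array.
    int countBlocksWithValue(short value) {
        int count = 0;
        for (short block : blockData) {
            if (block == value) count++;
        }
        return count;
    }
}
```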
Finally, for most CPUs there's also SIMD ("Single Instruction Multiple Data") - e.g. SSE/AVX on 80x86, Neon on ARM, etc. SIMD can and does give a massive performance boost. To get the benefits you have to pack multiple pieces of data together, so that your "single instructions" can operate on the "multiple pieces of data". For something like StarMade (e.g. where a lot of processing involves arrays of block IDs, and where the data could easily be packed in a form suitable for SIMD) it could easily give major performance improvements. SIMD does not work on "3 bytes per piece of data". Modern JVMs do optimise for SIMD, and StarMade is probably missing out on major performance improvements.
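As an illustration of the kind of loop this is about, here is a small Java sketch: a plain element-by-element pass over a `short[]` that pulls out block IDs. The names are invented, and the 11-bit mask assumes every entry is a solid block with its ID in the low 11 bits. Whether the JIT actually turns this into SIMD instructions depends on the JVM version and the hardware, so treat it as a candidate for vectorisation, not a guarantee.

```java
final class IdExtract {
    // Assumption: the block ID sits in the low 11 bits of each 16-bit value.
    static final int SOLID_ID_MASK = 0x7FF;

    // Same simple operation on every element of a contiguous array of 2-byte
    // values: the shape of loop that packs well into SIMD registers.
    static void extractIds(short[] blocks, short[] idsOut) {
        int n = Math.min(blocks.length, idsOut.length);
        for (int i = 0; i < n; i++) {
            idsOut[i] = (short) (blocks[i] & SOLID_ID_MASK);
        }
    }
}
```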
What I'm saying is that HP values should be a completely separate array. This array would be "1 byte per block", where HP ranges from 1 to 255. If HP is 0 then the block was destroyed or didn't exist in the first place.
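One Java-specific catch is that `byte` is signed, so the stored value has to be read back as unsigned. An unsigned byte holds 0 to 255, and with 0 reserved for "destroyed or never existed" the usable HP range is 1 to 255. A minimal sketch, with class and method names of my own invention:

```java
final class HpArray {
    private final byte[] hp;  // 1 byte per block; 0 = destroyed or never existed

    HpArray(int blockCount) {
        hp = new byte[blockCount];
    }

    int get(int index) {
        // Java's byte is signed (-128..127); convert to the intended 0..255 range.
        return Byte.toUnsignedInt(hp[index]);
    }

    void set(int index, int value) {
        if (value < 0 || value > 255) throw new IllegalArgumentException("HP must be 0..255");
        hp[index] = (byte) value;
    }

    boolean exists(int index) {
        return hp[index] != 0;
    }

    // Applies damage and returns true if this hit destroyed the block.
    boolean damage(int index, int amount) {
        int current = get(index);
        if (current == 0) return false;  // nothing there to destroy
        int remaining = Math.max(0, current - amount);
        hp[index] = (byte) remaining;
        return remaining == 0;
    }
}
```

The `exists` check is also all that something like a coarse collision test needs, which is why it can run against this array alone.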
Also note that for things like collision detection; you'd use the "1 byte per block" HP array and not the "2 bytes per block" block ID data. This should halve the cache misses (as there's half as much data for the CPU to fetch), and (maybe) get twice the benefits from SIMD.