I learned Applesoft BASIC to draw a surprise Dickbutt.
If we’re counting machine code, I learned 6502 ASM for faster division on NES, because it was half the CPU time on my first-person shooter. After many iterations pushing it down to mere hundreds of cycles, I slapped my forehead and implemented log tables in like 512 bytes and 45 cycles. It’s negligible now. And supports constant fractional scaling. And has overflow / underflow saturation. Really, 6502 ASM is fantastic to fuck around in, even though the rest of the NES’s hardware suuucks.
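For anyone who hasn't seen the log-table trick: a rough sketch of the idea in Python, not the actual NES routine. It assumes 8-bit inputs, two 256-entry tables (512 bytes total), and a made-up fixed-point scale of 32 steps per octave. `SCALE`, `LOG`, `EXP`, and `log_divide` are all hypothetical names. Division becomes a subtraction of logs plus one lookup, constant scaling is just one more add, and out-of-range results saturate.

```python
import math

SCALE = 32  # hypothetical: fixed-point steps per power of two

# 256-entry log table: LOG[n] ~ SCALE * log2(n), clamped to one byte.
# Entry 0 is a placeholder, since log(0) is undefined.
LOG = [0] + [min(255, round(SCALE * math.log2(n))) for n in range(1, 256)]

# 256-entry exp table: EXP[i] ~ 2 ** (i / SCALE), saturated to one byte.
EXP = [min(255, round(2 ** (i / SCALE))) for i in range(256)]


def log_divide(a, b, scale_log=0):
    """Approximate a / b for 8-bit a and b using table lookups.

    scale_log is an optional constant added in the log domain, i.e. the
    result is multiplied by 2 ** (scale_log / SCALE).
    """
    if a == 0:
        return 0
    if b == 0:
        return 255  # assumption: divide-by-zero saturates high
    diff = LOG[a] - LOG[b] + scale_log
    if diff < 0:
        return 0    # underflow: quotient below 1 saturates to 0
    if diff > 255:
        return 255  # overflow: quotient too large saturates to 255
    return EXP[diff]


print(log_divide(128, 2))         # 64 (powers of two are exact)
print(log_divide(200, 10))        # close to 20, within table rounding
print(log_divide(128, 1, SCALE))  # 255, saturated
```

The answers are approximate away from powers of two, which is the price for replacing a division loop with two lookups and a subtract.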
From googling, I see Wikibooks has a book for it: https://en.wikibooks.org/wiki/6502_Assembly
I mostly used this page to know what’s possible, and occasionally reinvented the wheel. Conditional jumps are still arcane and fragile in my hands. But I benchmarked all kinds of sequential memory access patterns before realizing the 6502 does not give a dang about reusing the same address.
On Z80, you want to load two registers, use them as a pointer, and tweak the low byte. The 6502 can just take an address and an offset in four cycles. So if you want to access $3000 as an array and read index 4, 5, 6, 7, you don’t LDX #4 and INX between reads, you LDX #4 and then LDA $3000,X, LDA $3001,X, LDA $3002,X, LDA $3003,X. For e.g. controller reads, you can hardcode bare addresses and it’s twice as fast.
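To put rough numbers on that, here's a tiny Python tally using the documented base cycle counts for those instructions. The two instruction lists are hypothetical stand-ins for the four reads only, with no stores or loop overhead, and they assume no page crossings.

```python
# Base NMOS 6502 cycle counts for the instructions involved.
CYCLES = {
    "LDX #imm": 2,
    "INX": 2,
    "LDA abs": 4,
    "LDA abs,X": 4,  # +1 cycle if the indexed access crosses a page
}


def total_cycles(program):
    """Sum the base cycle cost of a list of instruction mnemonics."""
    return sum(CYCLES[op] for op in program)


# Approach 1: fixed base address, bump X between reads.
increment_x = ["LDX #imm", "LDA abs,X", "INX", "LDA abs,X",
               "INX", "LDA abs,X", "INX", "LDA abs,X"]

# Approach 2: leave X alone and bake the offset into each base address.
shift_base = ["LDX #imm"] + ["LDA abs,X"] * 4

print(total_cycles(increment_x))  # 24
print(total_cycles(shift_base))   # 18
```

Same four reads, six fewer cycles, and X still holds the original index afterwards.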