* Add WIP support for Vertex Program A, add the FADD_I32 shader instruction, small fix on FFMA_I encoding, nits
* Add separate subroutines for program A/B, and copy attributes to a temp
* Move finalization code to main
* Add new line after flip uniform on the shader
* Handle possible case where VPB uses an output attribute written by VPA but not available on the vbo
* Address PR feedback