* Implement ARM exclusive load/store with compare exchange insts, and enable multicore by default
* Fix comment typo
* Support Linux and OSX on MemoryAlloc and CompareExchange128, some cleanup
* Use intel syntax on assembly code
* Adjust identation
* Add CPUID check and fix exclusive reservation granule size
* Update schema multicore scheduling default value
* Make the cpu id check code lower case aswell