Support more detailed training configs and update official configs
Updated official README and configs
- More detailed instructions (PRs #55, #56)
- Restructured official configs (PR #55)
- Updated FT config for ImageNet (PR #55)
Support detailed training configurations
- Step-wise parameter updates in addition to epoch-wise parameter updates (PR #58)
- Gradient accumulation (PR #58)
- Max gradient norm (PR #58)
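
As a rough illustration of how these three options typically interact in a training loop, here is a minimal, framework-free sketch. It is not the code from PR #58. The names `clip_by_norm`, `train`, `accum_steps`, `max_grad_norm`, and `step_decay` are hypothetical, and the linear model with squared error stands in for a real network.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale grads so that their L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grads]
    return list(grads)


def train(batches, weights, lr=0.1, accum_steps=2, max_grad_norm=1.0, step_decay=1.0):
    """Toy loop for a linear model with loss 0.5 * (w.x - y)^2 (hypothetical setup).

    Gradients are accumulated over accum_steps batches, clipped to
    max_grad_norm, and then applied in one parameter update.
    """
    weights = list(weights)
    accum = [0.0] * len(weights)
    num_updates = 0
    for i, (x, y) in enumerate(batches, start=1):
        pred = sum(w * xi for w, xi in zip(weights, x))
        err = pred - y
        # Divide by accum_steps so the accumulated gradient is an average.
        for j, xi in enumerate(x):
            accum[j] += err * xi / accum_steps
        if i % accum_steps == 0:
            clipped = clip_by_norm(accum, max_grad_norm)
            weights = [w - lr * g for w, g in zip(weights, clipped)]
            accum = [0.0] * len(weights)
            num_updates += 1
            # Step-wise update: adjust the learning rate after each
            # parameter update rather than once per epoch.
            lr *= step_decay
    # Batches left over after the last full accumulation window are
    # ignored in this sketch.
    return weights, num_updates


if __name__ == "__main__":
    data = [([1.0], 2.0)] * 4
    final_weights, updates = train(data, [0.0], lr=0.1, accum_steps=2)
    print(final_weights, updates)
```

With `accum_steps=2`, four batches produce two parameter updates, so the effective batch size doubles without increasing per-batch memory use.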