Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models (arXiv:2401.01335)