A BI-LEVEL REINFORCEMENT LEARNING MODEL FOR OPTIMAL SCHEDULING AND