In the daily dispatching of last-mile urban delivery, a delivery manager has to balance workload among couriers to maintain workforce morale. We consider two types of workload: incentive workload, which relates to delivery quantity and affects a courier’s income, and effort workload, which relates to delivery time and affects a courier’s health. Incentive workload has to be balanced over a relatively long period (a payroll cycle, such as a week or a month), whereas effort workload has to be balanced over a relatively short period (a shift or a day). We introduce a multi-period workload balancing problem under stochastic demand and dynamic daily dispatching, formulate it as a Markov decision process (MDP), and derive a lower bound on the optimal value of the MDP model. We propose a balanced penalty policy based on cost function approximation and use a hybrid algorithm that combines the modified nested partitions method with the KN++ procedure to search for an optimal policy parameter. A comprehensive numerical study demonstrates that the proposed balanced penalty policy performs close to optimal on small instances and outperforms four benchmark policies on large instances. The study also provides insight into the impact of demand variation and of the relative weight a manager places on operating cost versus workload balance.