QUICK REVIEW

[Paper Review] Scheduling with a processing time oracle

Fanny Dufossé, Christoph Dürr|arXiv (Cornell University)|May 7, 2020

Scheduling and Optimization Algorithms | References: 15 | Cited by: 19

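One-Sentence Summary

This paper studies a single-machine scheduling problem in which each job's processing time is either short (p) or long (p+x) but is initially unknown; the algorithm can reveal the actual value only by querying a processing time oracle, and each query occupies one time unit in the schedule. The goal is to minimize the competitive ratio, that is, the ratio between the cost of the algorithm's schedule and the optimal cost under full information. For both the adaptive and the non-adaptive model, the authors present optimal polynomial-time algorithms that follow a two-phase strategy: first test jobs, then execute jobs obliviously. The minimal competitive ratio is obtained through dynamic programming and a boundary-path computation on a grid.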

ABSTRACT

In this paper we study a single machine scheduling problem with the objective of minimizing the sum of completion times. Each of the given jobs is either short or long. However, the processing times are initially hidden to the algorithm, but can be tested. This is done by executing a processing time oracle, which reveals the processing time of a given job. Each test occupies a time unit in the schedule, therefore the algorithm must decide for which jobs it will call the processing time oracle. The objective value of the resulting schedule is compared with the objective value of an optimal schedule, which is computed using full information. The resulting competitive ratio measures the price of hidden processing times, and the goal is to design an algorithm with minimal competitive ratio. Two models are studied in this paper. In the non-adaptive model, the algorithm needs to decide beforehand which jobs to test, and which jobs to execute untested. However, in the adaptive model, the algorithm can make these decisions adaptively depending on the outcomes of the job tests. In both models we provide optimal polynomial time algorithms following a two-phase strategy, which consists of a first phase where jobs are tested, and a second phase where jobs are executed obliviously. Experiments give strong evidence that optimal algorithms have this structure. Proving this property is left as an open problem.
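
For orientation, the objective and the competitive ratio described in the abstract can be written in standard notation. The symbols below ($C_j$ for the completion time of job $j$, $\mathrm{ALG}(I)$ and $\mathrm{OPT}(I)$ for the two objective values on an instance $I$, and $\rho$ for the ratio) are our own shorthand and are not taken from the paper.

```latex
\[
  \mathrm{cost}(S) \;=\; \sum_{j=1}^{n} C_j ,
  \qquad
  \rho(\mathrm{ALG}) \;=\; \sup_{I} \frac{\mathrm{ALG}(I)}{\mathrm{OPT}(I)} ,
\]
% ALG(I): sum of completion times of the algorithm's schedule, tests included.
% OPT(I): sum of completion times of the best schedule under full information.
```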

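Research Motivation and Objectives

Design an algorithm that minimizes the competitive ratio of single-machine scheduling when job processing times are hidden and can be revealed only through costly oracle queries.
Compare adaptive and non-adaptive strategies, where an adaptive algorithm can choose which jobs to test based on earlier test outcomes.
Determine the optimal number of jobs to test so as to minimize the sum of completion times relative to the optimal full-information schedule.
Prove or disprove that optimal algorithms follow a two-phase structure: first test jobs, then execute jobs without further queries.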

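Proposed Method

Frame the problem as a competitive analysis, with the competitive ratio defined as the cost of the algorithm's schedule divided by the cost of the optimal full-information schedule.
Assume a two-phase strategy: the first phase tests a subset of the jobs; the second phase executes all jobs (tested and untested) using the information gathered, with no further queries.
Compute the optimal strategy with a grid-based dynamic program in which each cell (c,d) represents c tested short jobs and d tested long jobs.
Compute the boundary path on the grid that minimizes the competitive ratio by marking cells and iteratively refining the stopping ratio R*.
In the adaptive model, the minimal competitive ratio is computed against an optimal adversary strategy, and the optimal algorithm follows the resulting path.
Run experiments on instances with up to n=10 jobs and uniformly sampled values of p and x to check the two-phase structure and the algorithms' performance.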
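
To make the competitive-ratio definition and the two-phase idea concrete, here is a minimal brute-force sketch in Python. It is not the paper's grid-based dynamic program. It assumes a simplified non-adaptive strategy of our own: test the first k jobs up front, then run the tested short jobs, then the untested jobs in a fixed order, then the tested long jobs. All function names (`sum_completion_times`, `optimal_cost`, `two_phase_cost`, `worst_case_ratio`, `best_test_count`) are hypothetical, and the adversary is modeled by enumerating every short/long assignment, which is only feasible for small n.

```python
from itertools import product


def sum_completion_times(durations, start=0.0):
    """Sum of completion times when jobs run back to back from time `start`."""
    total, clock = 0.0, start
    for d in durations:
        clock += d
        total += clock
    return total


def optimal_cost(is_long, p, x):
    """Full-information optimum: no tests, shortest processing time first."""
    return sum_completion_times(sorted(p + x if long_ else p for long_ in is_long))


def two_phase_cost(is_long, k, p, x):
    """Cost of the assumed strategy: k unit-time tests, then execution.

    Execution order (an assumption of this sketch): tested short jobs,
    untested jobs in their given order, tested long jobs.
    """
    tested, untested = is_long[:k], is_long[k:]
    order = (
        [p] * tested.count(False)
        + [p + x if long_ else p for long_ in untested]
        + [p + x] * tested.count(True)
    )
    # The k tests occupy the first k time units of the schedule.
    return sum_completion_times(order, start=float(k))


def worst_case_ratio(n, k, p, x):
    """Worst ratio over all 2^n short/long assignments (the adversary)."""
    return max(
        two_phase_cost(a, k, p, x) / optimal_cost(a, p, x)
        for a in product((False, True), repeat=n)
    )


def best_test_count(n, p, x):
    """Number of tests k in 0..n that minimizes the worst-case ratio."""
    return min(range(n + 1), key=lambda k: worst_case_ratio(n, k, p, x))
```

For example, `worst_case_ratio(3, 3, 1.0, 5.0)` evaluates to 2.5: testing all three jobs delays every completion by 3 time units, and the adversary's best response is to make all jobs short, giving 15 / 6. Calling `best_test_count(n, p, x)` for a few small values of n shows how the preferred number of tests shifts with p and x in this simplified model; the numbers need not match those reported in the paper.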

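Experimental Results

Research Questions

RQ1: What is the minimal achievable competitive ratio for single-machine scheduling with hidden processing times, in the adaptive and in the non-adaptive model?
RQ2: Does every optimal algorithm follow a two-phase strategy (first test jobs, then execute them obliviously), whether adaptive or not?
RQ3: How does the gain of adaptivity (the difference between the non-adaptive and the adaptive competitive ratio) vary with p and x?
RQ4: Can the optimal strategy be computed efficiently, and what is the time complexity of the proposed algorithms?
RQ5: Does the two-phase structure remain optimal in the adaptive model, or are there exceptions where a more complex strategy beats it?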

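Key Findings

In both the adaptive and the non-adaptive model, the proposed algorithms achieve the minimal competitive ratio with a two-phase strategy; the time complexity is O(n³) in theory and O(n²) in practice.
Experiments with up to 10 jobs found no counterexample to the two-phase conjecture, which gives strong empirical support for the optimality of this structure.
The gain of adaptivity is small, about 2%, which indicates that adaptive decisions improve only modestly on non-adaptive strategies in this setting.
The competitive ratio rises as x grows (a larger gap between short and long jobs) and falls as p grows (longer short jobs), approaching 1 as p increases.
In the adaptive model, most equilibrium schedules follow the pattern (Tx)* (Tp)* (Ex)* (Ep)*, but exceptions occur for larger n, so no simple structure can be exploited for further optimization.
In the non-adaptive model, the regions of the (x,p) space with a constant number of tests are connected, whereas in the adaptive model such regions are disconnected, which points to more complex decision boundaries.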

This review was generated by AI and reviewed by human editors.