Multi-League Sports Scheduling with Team Interdependencies: An Optimization Model
Abstract
Every year, a large number of matches must be scheduled for professional and amateur sports teams. Several constraints have to be considered, including the overall capacity of venues and interdependencies between teams of the same club. As interdependent teams of a club play in different leagues, finding an optimal solution is very challenging for practitioners. While the problem of respecting capacity restrictions is well-addressed in prior work, interdependencies between teams are widely neglected, despite being a problem of major importance in practice. This paper enhances the formal definition of the multi-league-sports scheduling problem to take team interdependencies into account. We create an optimization problem to be solved by means of integer linear programming, and prove the corresponding decision problem to be -complete by a polynomial reduction from 3-SAT. An implementation which was used to schedule German table tennis leagues of a certain district demonstrates the practical applicability of the approach.
Keywords and phrases:
sports scheduling, linear optimization, constraint programming2012 ACM Subject Classification:
Theory of computation Constraint and logic programming ; Mathematics of computing Combinatorial optimizationSupplementary Material:
Software (source code): https://github.com/NilsWeidmann/KeyGenerator [35]
archived at
swh:1:dir:c668e66011bff623318334bb60799385ccb4357e
archived at
swh:1:dir:6b32a179b07ef602689a3c799174f938fe11b379
Editors:
Maria Garcia de la BandaSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction and Motivation
In several disciplines, sports teams play each other twice a season – once at home and once away – which is called a double round robin tournament. Usually, schedules for double round robin tournaments are created prior to the season, and define in which rounds (i.e., in which weeks) teams play each other. To assure fairness among teams, these schedules should be compact, such that teams play each week [4]. According to their skill, age and gender, teams are grouped into different leagues. Especially in amateur and youth sports, teams of a league are further subdivided into divisions according to regional aspects [28, 33].
Scheduling such divisions gets complicated when additional constraints have to be respected. When two teams share a venue, e.g., their home matches cannot take place in the same week. Prominent examples include AC and Inter Milan (football), New York Giants and Jets (American football) and Los Angeles Lakers and Sparks (basketball).
While creating schedules for a few professional leagues might be already challenging [12, 1], the problem gets more severe when scheduling hundreds of amateur divisions, where it is common practice that multiple teams of the same club share a common infrastructure that underlies capacity constraints. This scenario is denoted as the multi-league sports scheduling problem in the following. As the number of matches which can be hosted by a club at the same time is limited, and teams that share a venue usually play in different divisions, these divisions cannot be scheduled independently.
Apart from overall capacity constraints, there are also other interdependencies between teams of a club that ensure that they play at home in complementary weeks, or in the same weeks on different days, to ensure that players can attend matches of both teams [30]. Furthermore, schedule creation is usually a distributed task: Schedulers are responsible for all divisions of a specific region allotted to them [7], which makes it necessary to respect scheduling decisions that were made on super-ordinate (e.g., national) level.
1.1 Related Work
Davari et al. propose an algorithm that minimizes the number of capacity violations of a schedule, which does not respect team interdependencies, though [7]. For the special case of all teams of a club playing at home in the same weeks without allowing capacity constraint violations, they prove this version of the problem to be -complete via a polynomial reduction from vertex coloring. Li et al. extend this work by a computational study [22], a construction and improvement heuristic that minimizes constraint violations, and a multi-objective approach minimizing both traveling distances and constraint violations [23].
Other approaches rather aim at creating an optimal schedule for just a few divisions. Schönberger solves the scheduling problem on the level of single matches by means of mixed-integer programming, focusing on the optimization of player substitution opportunities [30, 31]. A computational study shows that solving the mixed-integer programming involves large efforts even for small instances. Creating optimal schedules for single (professional) leagues is a well-addressed topic in prior work [24, 9, 11, 2, 19, 34]. Further research concentrates on scheduling sports competitions based on other objectives, such as balancing and minimizing breaks [8, 26, 14, 3, 18, 13, 25], traveling distances [16, 10, 29, 15], required venues [32], scheduling matches in specific rounds [5], or introducing additional matches [20, 6].
To the best of our knowledge, none of the existing approaches aims at scheduling a large number of interdependent divisions, respecting both capacity constraints, team interdependencies and constraints from super-ordinate schedules.
1.2 Contribution and Paper Organization
In this paper, we address the identified research gap by extending the definition of the multi-league sports scheduling problem to take team interdependency constraints (which also have implications on the venue utilization) and constraints from super-ordinate schedules into account. The optimization goal is to minimize the overall number of deviations in actual and required schedules, whereby those are guaranteed to be at least similar for all teams. Thereby, the problem definition comes closer to the requirements of sports scheduling in practice. We prove the extended multi-league sports scheduling problem to be -complete by a polynomial reduction from 3-SAT, and underpin the practical applicability of our approach by scheduling German table tennis leagues of a certain district. Further experiments on generated data demonstrates the scalability of our approach.
The remainder of this paper is organized as follows: An overview of the problem domain and the proposed solution approach is provided in Sect. 2. The multi-league sports scheduling problem is formally defined in Sect. 3, before an integer linear program is constructed based on this definition in Sect. 4. The decision problem corresponding to multi-league sports scheduling problem is proven to be -complete in Sect. 5, and a practical implementation is sketched in Sect. 6. We apply our implementation to a real-world case study and evaluate its runtime performance in Sect. 7, before Sect. 8 summarizes the main results and sketches promising directions for future research.
2 Background and Solution Approach
Every year, schedules for thousands of (amateur) sports leagues have to be created, whereby matches between two teams take place at either of the teams’ venues. Typically, teams of the same club share a common infrastructure (venues, material, etc.). The number of simultaneous matches is thus restricted by the availability of the shared infrastructure. Oftentimes, teams playing home on different days must be considered separately (e.g., those playing home on Saturdays and Sundays), turning capacity constraints into team interdependencies.
It is desirable to find a schedule that respects those dependency constraints to the largest possible extent. To simplify the scheduling process, matches are not scheduled individually, but on the basis of grids that define the order of matches throughout the season. Matches are expressed in terms of key pairs, whereby the keys serve as placeholders for the teams of a division. A concrete example illustrating the basic terminology is presented in Sect. 3.1.
The use of uniform grids for each division makes it possible to express dependency constraints involving teams playing in different divisions, as well as synchronizing schedules for leagues of different levels (e.g., national, federal state and regional level). For instance, assigning the same key to teams in different divisions ensures that they play home in parallel, whereas other key combinations guarantee that two teams play home in opposite weeks.
The overall process is depicted in Fig. 1 in form of a unified modeling language activity diagram, and comprises the following four steps:
- ①
-
Before the schedule creation process starts, teams of a league must be grouped into divisions. Club administrators define interdependencies between teams of their clubs based on the availability of venues and/or tactical reasons via an internet portal. Regular match days and starting times for home matches are also defined in this step, which will come into play in step ④ when generating the match schedules.
- ②
-
A scheduler assigns keys to teams in a way that dependency constraints are respected to the largest possible extent. Note that keys for clubs of which at least one team plays on a super-ordinate level are provided as additional input for the key determination process, causing a loop in the data flow.
- ③
-
Conflict situations cannot be avoided completely in practice, as teams of neighboring clubs play against each other in multiple divisions. Thus, the scheduler resolves conflicts by assigning different, but similar keys to some teams, balancing interests among clubs.
- ④
-
Finally, the scheduler enters the generated keys into the internet portal, such that match schedules can be generated based on predefined uniform grids and the regular match days and starting times as defined in step ①.
Step ②, i.e., the determination of keys for schedule creation, is a hard task to solve: The keys to be assigned to teams of the same club must be aligned, while those teams are spread over multiple divisions, and each key can only be assigned once per division. In practice, due to the lack of proper conceptual and technical solutions, this process is done manually. Schedulers try to overcome the inherent complexity by applying a divide-and-conquer approach, leading to low-quality solutions [22]. This is problematic for several reasons: First, schedulers put large efforts in finding a barely acceptable solution, whereby efforts increase disproportionately with the number of divisions to schedule [17]. Second, clubs thereupon have to eliminate the effects of bad schedules, e.g., by postponing matches or renting further venues, resulting in additional efforts on their side.
To overcome these hurdles, we propose to formalize the task of sports schedule creation as a combinatorial optimization problem. Instead of manually determining suitable keys, the scheduler transfers the dependency constraints of all clubs allotted to their region to a key generator. This software tool shall pursue the optimization goal of minimizing the number of conflict situations, i.e., situations in which the key assigned to a team differs from the required key. The advantage of modeling the problem at hand in terms of combinatorial optimization is that established SAT or integer linear program solvers can be integrated into the key generator, promising high performance and reliability. The final decision of how to resolve conflicts should be left to the scheduler, though, to balance the interests of the involved stakeholders. As a result, high-quality schedules can be created with substantially reduced efforts for both schedulers and club administrators.
3 Problem Definition
This section introduces the general setting and terminology formally, before team interdependencies are specified. Finally, scheduling constraints for interdependent divisions are defined, resulting in a linear optimization problem.
3.1 General Setting and Terminology
The basic concepts of the problem definition comprise clubs, teams, and divisions. Each team belongs to exactly one club and one division.
Definition 1 (Club, Team, Division).
Let be a set of clubs and be a set of divisions. Let denote the set of teams of each club, and let denote the set of teams of each division. Let denote the set of all teams. Let , be functions that map teams to their respective club and division.
An example instance with 18 teams allocated to three divisions is shown in Tab. 1 (ignoring the columns “Key” and “WS” for the moment). Each team is represented by the name of its club, while most clubs have teams in multiple divisions. It is also possible that multiple teams of a club play in one division, such as two teams of “Austin” in Men’s division.
| Women’s division | Men’s division | Youth division | ||||||
|---|---|---|---|---|---|---|---|---|
| Key | Team | WS | Key | Team | WS | Key | Team | WS |
| 1 | Austin | 4 | Austin | 1 | Austin | |||
| 2 | Berkeley | 2 | Berkeley | 3 | Chicago | |||
| 6 | Chicago | 1 | Detroit | 2 | Detroit | |||
| 5 | Detroit | 6 | Green Bay | 5 | El Paso | |||
| 4 | El Paso | 5 | Houston | 6 | Green Bay | |||
| 3 | Fort Worth | 3 | Austin II | 4 | Houston | |||
double round robin tournaments comprise multiple rounds, which are usually scheduled in consecutive weeks. To create schedules, grids are used that determine which matches are scheduled in which week.
Definition 2 (Grid).
Let be a set of grids, with being the size of a grid . Let be a function that maps divisions to their grids. It holds that , i.e., the number of teams in a division must be less or equal to the grid size of the division.
An example grid for a division of up to six teams is depicted in Tab. 2. It can be read as follows: 1 - 6 in the first row of week W1 means that team 1 plays team 6 at home. The matches in W6 - W10 correspond to those in W1 - W5, swapping the home and away teams.
| W1 | W2 | W3 | W4 | W5 | W6 | W7 | W8 | W9 | W10 |
|---|---|---|---|---|---|---|---|---|---|
| 1 - 6 | 6 - 4 | 2 - 6 | 6 - 5 | 3 - 6 | 6 - 1 | 4 - 6 | 6 - 2 | 5 - 6 | 6 - 3 |
| 2 - 5 | 5 - 3 | 3 - 1 | 1 - 4 | 4 - 2 | 5 - 2 | 3 - 5 | 1 - 3 | 4 - 1 | 2 - 4 |
| 3 - 4 | 1 - 2 | 4 - 5 | 2 - 3 | 5 - 1 | 4 - 3 | 2 - 1 | 5 - 4 | 3 - 2 | 1 - 5 |
The numbers in a grid serve as placeholders for the teams of a division. In the following, these numbers are denoted as keys (an example allocation of teams to keys can be found in the respective columns of Tab. 1):
Definition 3 (Key).
Let be a set of keys for a grid .
3.2 Team Interdependencies
To ensure that venues can be shared between two teams, it is important that they do not play at home in the same week. Grids have opposite pairs of keys, e.g., , and in the example of Tab. 2. By choosing the keys for interdependent teams accordingly, it can be assured that at most one team plays at home each week.
Definition 4 (Opposite Keys).
Let be a total function that maps keys of a grid to their opposite. It holds:
| (1) |
Using such standardized grids, team interdependencies can be defined across multiple divisions. For grids of different sizes (which are necessary for divisions with different numbers of teams), there must be a notion of which keys correspond to each other, i.e., for which keys home games are scheduled in the same weeks. Keys fulfilling this property are denoted as parallel keys:
Definition 5 (Parallel Keys).
Let define a relation for parallel keys and (potentially) different grids. It holds:
| (2) |
In contrast to professional leagues, in which matches are often shown in the media, the order of matches or the allocation of matches to certain weeks is not that critical for amateur leagues. It is far more important for clubs to define which teams should have their matches at home in opposite weeks (such that they can share one venue) or in parallel weeks (such that players can attend matches of two teams, given they do not take place on the same day).
In order to express this requirement, week schemes are used: Teams of a club that follow the same week scheme play at home in the same weeks, whereas teams that follow an opposite week scheme play at home in opposite weeks. There is also a distinguished week scheme to express that there are no dependencies to other teams. This can happen if a club has only one team, or the capacity of its venue(s) is sufficient to host matches of all teams at the same time (without this being an explicit requirement for the club).
Definition 6 (Week Scheme).
Let and be two sets of week schemes. and are denoted as pairs of opposite week schemes. Let , with being a distinguished week scheme. Let be a function that maps teams to week schemes. Let be the set of teams of a club that follow the week scheme . It holds that
| (3) |
The “WS” column of Tab. 1 provides examples for week schemes. The women’s and youth teams of Detroit follow opposite week schemes and , whereas the week scheme for the men’s team indicates that there are no interdependencies to other teams. The women’s and men’s teams of El Paso follow independent week schemes and , meaning that there are no interdependencies to each other, but potentially to other teams of the club.
Apparently, each key can only be assigned once per division (the constraint will be specified formally in Def. 12). Without weakening the requirement for opposite and parallel keys, instances of realistic size are barely feasible. A simple example can be found in the men’s division of Table 1: Austin and Austin II follow the same week scheme , which means that they should be assigned the same key, which would directly lead to an infeasible system (which was already pointed out by Davari et al. in prior work [7]).
To overcome this non-exceptional problem, similar keys are defined for grids, meaning that only a few deviations from the intended sequence of home and away matches occur. In Tab. 2, one can observe that the keys and only differ in two weeks (W1 and W6), just as the keys and . Therefore, keys and , respectively, can be considered as similar keys that can be assigned in case the required key is not available.
Definition 7 (Similar Keys).
Let define a relation for similar keys of a grid . It holds that
| (4) |
3.3 Key Assignment
In this subsection, we elaborate on the assignment of keys to clubs (one key per week scheme), and based on that, the assignment of keys to teams. Before keys can be assigned to week schemes of clubs, it must be clear to which grid the keys refer. Regarding teams, the grid is predefined by the divisions the teams play in, whereas such an automatism does not exist for clubs. Therefore, a reference grid is defined for each pair of opposite week schemes.
Definition 8 (Reference Grid).
Let be a function that assigns a reference grid to each pair of opposite week schemes. It holds that
| (5) |
Based on the reference grid, keys can be assigned to week schemes of clubs according to Def. 9. Note that opposite keys must be assigned for opposite week schemes, which is formalized in Eq. 6.
Definition 9 (Key Assignment for Clubs).
Let be a total function that maps clubs to keys for a week scheme . It holds that
| (6) |
As mentioned in Sect. 1.1 and 1.2, the scheduling task is usually distributed between multiple schedulers for different levels (e.g., national, federal state and regional level). Despite having a potentially negative impact on the solution quality, this approach is indispensable to cope with the number of divisions that have to be scheduled, especially when schedules are created mostly manually. Usually, schedules for super-ordinate leagues are created first, and impose constraints for scheduling sub-ordinate leagues. In particular, key assignments for clubs that have teams in super-ordinate leagues are fixed in advance:
Definition 10 (Fixed Key Assignments for Clubs).
Let be a set of clubs, for which there exists a fixed key assignment. Let be a total function that maps these clubs to keys for a week scheme . It holds that
| (7) |
The assignment of keys to teams is specified in Def. 11.
Definition 11 (Key Assignment for Teams).
Let be a total function that maps teams to keys.
In the remainder of this section, a connection will be established between the key assignments of clubs and their respective teams. As mentioned beforehand, we must also ensure that each key is unique within a division, i.e., is assigned to at most one team per division, which is specified in Def. 12.
Definition 12 (Key Uniqueness).
| (8) |
A central requirement for clubs is that – while minor deviations are allowed – teams that follow the same week scheme are assigned keys in a way that they play at home in the same weeks. Therefore, the keys of clubs and their teams must be aligned with respect to the different week schemes. In particular, each team’s key must be similar to its club’s key of the respective week scheme, which is specified in Def. 13.
Definition 13 (Key Similarity).
| (9) |
Using the previous sets of constraints, we define satisfiability for problem instances of multi-league sports scheduling problem in Def. 14.
Definition 14 (Satisfiability).
An instance of multi-league sports scheduling problem is satisfiable, if:
| (10) |
While it is necessary in practice to assign similar keys in case the required key is unavailable, this should be avoided whenever possible, as it can, e.g., reduce the number of capacity violations. Thus, the optimization goal is to minimize the number of teams that receive a key that is not parallel to the key which is assigned to their club for the week scheme the team follows. The resulting optimization problem is specified in Def. 15.
Definition 15 (multi-league sports scheduling problem as an Optimization Problem).
| (11) |
4 Construction of the Integer Linear Program
In this section, the optimization problem defined in Sect. 3 is converted into a (binary) integer linear program that can be solved to optimality with a wide range of well-established solvers.
First, binary variables are associated with teams, representing the information whether a specific key is assigned to the team.
Definition 16 (Binary Variables for Teams).
Let be binary variables. Let
Likewise, binary variables are associated with clubs, adding the week scheme as a further dimension.
Definition 17 (Binary Variables for Clubs).
Let be binary variables. Let
Up to here, the information whether a team received the required or a similar key was expressed implicitly. As we want to minimize the number of teams for which this is the case, further binary variables are introduced that represent this conflict situation. They operate as auxiliary variables and will be essential for specifying the objective function.
Definition 18 (Binary Variables for Conflicts).
Let be binary variables.
In the remainder of this section, the necessary constraints for a feasible solution are set up, before the objective function completes the optimization problem. We start with a set of simple but indispensable constraints that ensure that every club is assigned exactly one key per week scheme (resembling the property of a total function in Def. 9).
| (12) |
Likewise, the same must hold for each team, as is a total function according to Def. 11.
| (13) |
Additionally, it must be ensured that each key is assigned at most once per division (cf. Def. 12 for the key uniqueness requirement).
| (14) |
Another set of constraints (15) ensures that opposite keys (cf. Def. 4) are assigned to opposite week schemes (cf. Def. 6).
| (15) |
For a subset of clubs, the key assignments are already fixed by external factors (cf. Def. 10), which is expressed by the following set of constraints:
| (16) |
It remains to encode the key similarity requirement (Def. 13) into linear constraints. Two aspects have to be taken into consideration: First, the keys for teams depend on the grid used in the respective divisions, whereas the clubs’ keys are determined by the reference grids per week scheme. Therefore, parallel keys have to be determined, as the grids in use are different in general. Second, the keys assigned to each of the teams must be similar (which includes equal) to one of the parallel keys determined in the previous step. As one valid key is sufficient for both steps and the previous constraints ensure that at most one key is assigned to each team and club/week scheme, all candidates in and , respectively, can be added up in (17).
| (17) |
While the previous set of constraints (17) guarantees that the keys assigned to teams are similar to the keys assigned to their clubs, we need to express that it is indeed more desirable to assign parallel keys (i.e., equal keys in case of equal grids) than only similar keys. Assigning a similar (but not parallel) key can be considered as the resolution of a conflict situation, as the key is required by another team of the same division as well. Therefore, the (auxiliary) binary variables for conflicts (Def. 18) can be used to “count” the number of teams to which a similar (but not parallel) key is assigned.
| (18) |
5 multi-league sports scheduling problem is -complete
The multi-league sports scheduling problem is defined as an optimization problem in this paper, whereas the complexity class is only defined for decision problems. The proof for -completeness will therefore refer to the decision problem, i.e., whether an instance of the multi-league sports scheduling problem is satisfiable according to Def. 14.
Lemma 19.
.
Proof.
To prove that holds, we must show that a witness for MLSSP can be verified within polynomial runtime. As a verifier, the constraint sets defined in Eq. (12) - (18) can be used. As many constraints involve a summation or iteration over a set of keys, which depends on the size of the respective grid, the maximum size of any grid in use is denoted as in the following. The runtime complexity of checking an inequality is , we further assume that the same holds for applying the functions (Def. 4) and (Def. 10), and evaluating the relations (Def. 5) and (Def. 7) using suitable hash functions.
Table 3 provides an overview of the runtime complexity of verifying constraints of Eq. (12) - (18). Note that the number of similar and parallel keys is bounded by for constraints (17) and (18).
| Constraint | Runtime Complexity |
|---|---|
| (12) | |
| (13) | |
| (14) | |
| (15) | |
| (16) | ) |
| (17) | = |
| (18) | = |
| Overall |
Overall, a witness for MLSSP can be verified in polynomial time, proving Lemma 19.
Lemma 20.
.
Proof.
We prove that multi-league sports scheduling problem is -hard by a polynomial reduction from 3-SAT, i.e., . Let be a formula of propositional logic in 3-CNF with clauses . Without loss of generality, we assume each clause to consist of exactly three literals (clauses with less than three literals can be brought into this form by doubling or tripling the literals present).
A formula is constructed by adding a variable – which does not occur in – to each clause of , and add a clause , such that and are equisatisfiable:
| (20) |
Based on , an multi-league sports scheduling problem instance is constructed in the following way:
-
-
-
, with
-
, , ,
-
, , , , ,
-
, , , , , , , , , , , , ,
-
A club is created for each distinct , and a club is created for
-
,
-
For each clause in (except for the last clause ), a division with four teams and is created. Each team corresponds to a literal in , . Each team corresponds to the literal .
-
Teams corresponding to negative literals follow the week scheme , those corresponding to positive literals follow week scheme .
Proposition 21.
is satisfiable
Proof.
For each club , let true is assigned to in , and false is assigned to in . holds if at least one literal of each clause in yields true. This means that in each clause, there must be at least one positive literal with variable assignment true or one negative literal with variable assignment false.
Equivalently, to satisfy , there must be at least one team of a club following week scheme with and , or following week scheme with and in each division of , resulting in at least one occurrence of . The same must hold vice versa (at least one occurrence of ) for each division of , which is trivially fulfilled because and one team of club following week scheme is present in each division .
This observation is depicted in Tab. 4, showing a case distinction for the eight possible cases. The columns contain the key assignments for the clubs , while the columns contain the sets of possible key assignments for the teams according to the relations and . A valid key assignment for the teams is highlighted in bold font for the first seven cases. Only in the last case, which corresponds to all variables of positive literals being assigned false and all variables of negative literals being assigned true in the respective clause of , no valid key assignment is possible for the teams of , proving Proposition 21.
| valid | ||||||||
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 2 | 1/2/4 | 1/2/4 | 1/2/4 | 2/3/4 | ✓ |
| 1 | 1 | 2 | 2 | 1/2/4 | 1/2/4 | 2/3/4 | 2/3/4 | ✓ |
| 1 | 2 | 1 | 2 | 1/2/4 | 2/3/4 | 1/2/4 | 2/3/4 | ✓ |
| 1 | 2 | 2 | 2 | 1/2/4 | 2/3/4 | 2/3/4 | 2/3/4 | ✓ |
| 2 | 1 | 1 | 2 | 2/3/4 | 1/2/4 | 1/2/4 | 2/3/4 | ✓ |
| 2 | 1 | 2 | 2 | 2/3/4 | 1/2/4 | 2/3/4 | 2/3/4 | ✓ |
| 2 | 2 | 1 | 2 | 2/3/4 | 2/3/4 | 1/2/4 | 2/3/4 | ✓ |
| 2 | 2 | 2 | 2 | 2/3/4 | 2/3/4 | 2/3/4 | 2/3/4 | ✗ |
Theorem 22.
multi-league sports scheduling problemmissing is -complete.
Proof.
There are some practical implications of the proof result, which will briefly discussed in the following. The key learning is that for very large instances, some multilevel divide-and-conquer approach will still be necessary to schedule leagues from national to regional level, even with tool support. For very large problem instances (e.g., an instance involving all football leagues of Spain), the scheduling problem gets intractable from a practical point of view, even if all required data was accessible to a single person. Heuristic approaches might succeed finding a sufficiently good solution, but usually need a basic feasible solution to start with, though, which itself is challenging to determine for the multi-league sports scheduling problem.
6 Implementation
Our approach is implemented as a C# desktop application [35]. The software architecture of our implementation is sketched in Fig. 2 in form of a unified modeling language component diagram.
Using the frontend & data acquisition component, the grouping of teams into divisions and the dependency constraints provided via an internet portal can be imported into the key generator via comma-separated values files. Fixed assignments must be entered manually. The data persistence layer processes data of clubs, divisions and teams, filters out information that is irrelevant for the match scheduling task and stores the relevant information on disk. The optimization model builder constructs an integer linear program based on the input data and the information on parallel, opposite and similar keys for each grid in use (grids for 6 - 14 teams are supported). The integer linear program is subsequently solved to optimality by a SAT or integer linear program solver. We use the CP-SAT solver[27] in our implementation. With the conflict resolution component, the user can modify the computed solution by deciding which teams should receive similar keys in conflict situations. This makes it possible to balance interests between the clubs involved, while the solution quality remains unchanged. The data acquisition and conflict resolution components form the user interface, whereas the other components are hidden in the back end.
7 Evaluation
In this section, the practical applicability of our approach shall be investigated. Particularly, the following research questions will be answered in the course of this section:
-
Q1
Can the approach be used to determine solutions for scheduling instances of realistic size?
-
Q2
Which factors influence the scalability of the approaches, i.e., which characteristics of the input affect the runtime and solution quality of the key generation process?
-
Q3
How does the approach perform compared to similar existing approaches?
Outline.
Q1 will be investigated by solving a real-world case study involving German table tennis leagues (Sect. 7.1). To answer Q2, the approach is applied on generated instances with different characteristic parameters (Sect. 7.2). Finally, we compare our approach to an implementation by Li et al. [22] to answer Q3 (Sect. 7.3).
Experimental setup.
All experiments were run on a Windows 11 notebook with a 2.0 GHz Intel i7 processor and 32 GB main memory. The CP-SAT solver was run with 8 parallel workers (i.e., threads) and a time-out of 300 seconds. To reduce the effect of outliers, the experiments were run 10 times (Q1), or on 10 different instances that were generated with the same characteristic parameters (Q2 and Q3), respectively. From these 10 test runs, the median values of the measured runtimes and objective function values were computed.
7.1 Case Study
We applied our approach to create the schedules for a district of the West-German Table Tennis Association for three seasons [36]. The characteristics of the three input instances are summed up in Tab. 5. They differ with respect to the number of teams, clubs, and divisions involved. It can be further observed that for the majority of teams, dependencies were defined (i.e., week schemes other than according to Def. 6 were used). Five week schemes were available for the clubs as predefined by the West-German Table Tennis Association. For a few clubs, key assignments were fixed by the schedulers of super-ordinate levels according to Def. 10.
| Season | # divi- | # teams (with | # clubs (with | # week | Runtime | Objective |
|---|---|---|---|---|---|---|
| sions | dependencies) | fixed assignm.) | schemes | (in s) | fun. value | |
| 2022/23 | 94 | 885 (638) | 194 (18) | 5 | 2.876 | 35 |
| 2023/24 | 49 | 516 (387) | 107 (16) | 5 | 2.145 | 16 |
| 2024/25 | 50 | 496 (392) | 106 (11) | 5 | 2.225 | 14 |
For all three instances, the application returns an optimal solution within a few seconds in all ten test runs. It can be concluded that the implementation is suitable for solving instances of realistic size. Limitations will be investigated in the following performance evaluation.
7.2 Performance Evaluation
As the implementation comprises an exact approach to determine the scheduling solution, the runtime is expected to increase exponentially for larger inputs. Other characteristic parameters (cf. Tab. 5) may also have an impact on the runtime and solution quality.
Base scenario construction.
To investigate the approach’s scalability, we derive a base scenario from the real-world case study of Sect. 7.1, which is subsequently varied with respect to its characteristic parameters. The base scenario consists of 100 divisions with 10 teams per division, which belong to 200 clubs. For 75% of the teams, dependencies are defined, and for 10% of the clubs, key assignments are fixed prior to the optimization. The number of week schemes is set to 5 (including ) in the base scenario.
Instance generation.
To analyze the effect of each characteristic parameter, their values are varied separately, resulting in 30 different configurations (cf. Tab. 6). For each configuration, 10 instances were generated at random [36] and handed over to CP-SAT solver.
| Characteristic | Runtime (in s) | Objective function value | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| # Divisions | 20 | 50 | 100 | 200 | 500 | 20 | 50 | 100 | 200 | 500 |
| 0.35 | 2.79 | 55.02 | T/O | T/O | 1 | 4 | 10 | 257 | 1202 | |
| # Teams per division | 6 | 8 | 10 | 12 | 14 | 6 | 8 | 10 | 12 | 14 |
| 0.28 | 1.13 | 2.87 | 2.97 | 3.40 | 2 | 2.5 | 6 | 8 | 8.5 | |
| # Clubs | 20 | 50 | 100 | 200 | 500 | 20 | 50 | 100 | 200 | 500 |
| T/O | 2.57 | 2.93 | 1.93 | 0.62 | 66.5 | 7 | 2.5 | 2.5 | 1.5 | |
| # Week schemes | 3 | 5 | 7 | 9 | 11 | 3 | 5 | 7 | 9 | 11 |
| 12.04 | 2.43 | 2.35 | 2.24 | 1.71 | 10 | 5 | 4 | 3 | 3 | |
| % Teams with dependencies | 75 | 80 | 85 | 90 | 95 | 75 | 80 | 85 | 90 | 95 |
| 2.73 | 2.55 | 2.60 | 2.88 | 3.74 | 5 | 4 | 4.5 | 5 | 7 | |
| % Clubs with fixed assignm. | 10 | 20 | 30 | 40 | 50 | 10 | 20 | 30 | 40 | 50 |
| 2.55 | 2.56 | 1.74 | 1.47 | 1.64 | 4.5 | 4 | 19 | 25.5 | 40 | |
Results.
The median values for runtime and solution quality are depicted in Tab. 6, while the measured values for each test run are listed in Appendices A.1 and A.2. For all conducted test runs, the CP-SAT solver returned an optimal or feasible solution, or classified the problem as infeasible. The values measured for infeasible instances were excluded from computing the median values.
The largest impact with respect to runtime and solution quality was observed when varying the number of divisions. For 200 and 500 divisions, the solver reached the time-out in all 10 runs, whereby the solution quality deteriorated substantially. This seems plausible, as the number of divisions corresponds to the size of the constructed integer linear program. Considering the number of teams per division, both runtime and solution quality deteriorate when increasing the division size. The effects are less pronounced than the effects of increasing the number of divisions, though, which can be explained by the increased flexibility of larger division sizes regarding key assignment.
The measured values for both runtime and solution quality show that reducing the number of clubs makes the problem harder. This seems valid, because an increased number of teams per club leads to more teams of a club following the same week scheme, severely restricting the possibilities of assigning suitable keys. Similarly, reducing the number of week schemes apparently increases the problem complexity, as more teams follow the same week scheme.
Increasing the portion of teams with dependencies slightly increases the runtime consumption at almost constant objective function values. The most noticeable difference occurs when increasing the portion to 95%, presumably caused by lacking flexibility regarding key assignment. Interestingly, increasing the portion of clubs with fixed assignments reduces the runtime but also the solution quality. An explanation could be that fixed assignments reduce the search space, but also cause unavoidable conflict situations.
In theory, one would expect the same runtime measurements and objective function values in all cells of Tab. 6 which represent the base scenario. When conducting performance tests, we varied each characteristic parameter separately, such that the base scenario occurred multiple times. As different instances were generated, the measurement results slightly differ.
7.3 Comparison to Similar Approaches
We compared our implementation to the approach of Li et al. [22], who determine optimal schedules based on capacity constraints without defining interdependencies between teams. They implemented both a heuristic and an exact approach and applied them on instances with 3,5 and 10 divisions with different numbers of teams (cf. Tab. 7). All instances were made publicly available [21], such that we were able to implement an adapter for their file format and run experiments on the same input data.
| Approach | Runtime (in s) per instance type | |||||
|---|---|---|---|---|---|---|
| 3-1 | 3-2 | 5-1 | 5-2 | 10-1 | 10-2 | |
| Li et al. (exact) | 66.33 | 16.55 | 588.10 | 803.71 | 678.28 | 528.63 |
| Li et al. (heuristic) | 0.02 | 0.11 | 0.79 | 0.01 | 3.97 | 0.93 |
| This approach | 0.16 | 0.29 | 0.05 | 0.06 | 0.09 | 0.10 |
Optimal results could be determined in all 60 test runs. The median values of the measured runtimes for 10 instances which were constructed with equal characteristics are shown in Tab. 7. Our implementation performs relatively well compared to the heuristic approach and clearly outperforms the exact approach. Interestingly, our approach takes most time for the instances with only three divisions, which can be explained by the reduced possibilities to avoid conflict situations.
The exact approach of Li et al. and our approach share the optimization goal (minimizing constraint violations) and the general setting (teams play double round robin tournament tournaments in several groups, clubs’ venues have capacity constraints). The main reason for the differences in runtime performance is that the search space constructed by Li et al. is much larger, as they use decision variables for every single match, while our approach operates on the level of assigning keys to teams. Li et al. can potentially further reduce the number of constraint violations via a sophisticated allocation of bye weeks to teams, or by stretching the schedule of smaller divisions to more weeks than necessary. However, our approach allows clubs to define that two specific teams should play in turn or in parallel, while Li et al. solely aim at minimizing the number of capacity constraint violations.
7.4 Summary and Threats to Validity
Revisiting our research questions, it can be stated that our approach is suitable to be applied on real-world instances (Q1). Its scalability is mainly restricted by the number of divisions and teams involved (Q2), but performs well in comparison to existing approaches (Q3).
The conducted experiments have a few limitations to be discussed in the following. Due to the lack of real-world examples, all instances used in the experiments of the performance evaluation (Sect. 7.2) were artificially generated. The time-out was set to a rather low value, which was indeed sufficient to identify whether modifying a characteristic parameter increases or decreases runtime and solution quality. Using more than one solver for the experiments and varying the characteristic parameters in combination could give more reliable insights into performance bottlenecks.
Furthermore, the differences with respect to runtime and solution quality often varied noticeably between test runs of the same configuration (cf. Appendices A.1 and A.2). This implies that the grouping of teams into divisions has a major impact on the problem complexity, which was not covered by the performance analysis. Usually, teams are not distributed randomly over the groups, but according to regional aspects. This means that the generated instances are probably harder to solve than practical use cases.
Finally, the presented evaluation was purely technical, whereas the validity of the applied quality metrics should be evaluated in the course of a user study.
8 Conclusion and Future Work
This paper defines an extended version of the Multi-League Sports Scheduling Problem (MLSSP) that takes capacity constraints, team interdependencies and constraints from super-ordinate schedules into account. The MLSSP is specified in form of a linear optimization problem, the corresponding decision problem is proven to be -complete. The approach was applied to scheduling German table tennis leagues of a certain district, underpinning its practical applicability. Experiments with generated instances of varying size highlight the factors influencing the approach’s scalability.
In future research, meta-heuristics such as simulated annealing, tabu search or genetic algorithms should be implemented to handle even larger instances. Regarding the problem definition, the similarity requirement of assignments for clubs and teams could be replaced by a requirement for pairwise similarity of assignments for teams of the same club, leading to better results in practice. Further constraints, such as inter-club dependencies or requirements for teams playing home or away in certain weeks should be easy to include. Finally, the validity of quality measures should be assessed in an empirical evaluation with practitioners.
References
- [1] Fernando Alarcón, Guillermo Durán, Mario Guajardo, Jaime Miranda, Hugo Muñoz, Luis Ramírez, Mario Ramírez, Denis Sauré, Matías Siebert, Sebastian Souyris, Andrés Weintraub, Rodrigo Wolf-Yadlin, and Gonzalo Andres Zamorano. Operations Research Transforms the Scheduling of Chilean Soccer Leagues and South American World Cup Qualifiers. Interfaces, 47(1):52–69, 2017. doi:10.1287/INTE.2016.0861.
- [2] Thomas Bartsch, Andreas Drexl, and Stefan Kröger. Scheduling the professional soccer leagues of Austria and Germany. Computers & Operations Research, 33(7):1907–1937, 2006. doi:10.1016/J.COR.2004.09.037.
- [3] Dirk Briskorn. Alternative IP Models for Sport Leagues Scheduling. In Jörg Kalcsics and Stefan Nickel, editors, Operations Research, Proceedings 2007, Selected Papers of the Annual International Conference of the German Operations Research Society (GOR). Springer, 2007. doi:10.1007/978-3-540-77903-2_62.
- [4] David Van Bulck and Dries R. Goossens. Handling fairness issues in time-relaxed tournaments with availability constraints. Comput. Oper. Res., 115:104856, 2020. doi:10.1016/J.COR.2019.104856.
- [5] Wayne Burrows and Christopher P. Tuffley. Maximising common fixtures in a round robin tournament with two divisions. Australas. J Comb., 63:153–169, 2015. URL: http://ajc.maths.uq.edu.au/pdf/63/ajc_v63_p153.pdf.
- [6] Mats Carlsson, Mikael Johansson, and Jeffrey Larson. Scheduling double round-robin tournaments with divisional play using constraint programming. European Journal of Operational Research, 259(3):1180–1190, 2017. doi:10.1016/J.EJOR.2016.11.033.
- [7] Morteza Davari, Dries R. Goossens, Jeroen Beliën, Roel Lambers, and Frits C. R. Spieksma. The multi-league sports scheduling problem, or how to schedule thousands of matches. Oper. Res. Lett., 48(2):180–187, 2020. doi:10.1016/J.ORL.2020.02.004.
- [8] Dominique de Werra, Loïc Jacot-Descombes, and P. Masson. A constrained sports scheduling problem. Discret. Appl. Math., 26(1):41–49, 1990. doi:10.1016/0166-218X(90)90019-9.
- [9] Federico Della Croce and Dario Oliveri. Scheduling the Italian football league: An ILP-based approach. Computers & Operations Research, 33(7):1963–1974, 2006. doi:10.1016/J.COR.2004.09.025.
- [10] Guillermo Alfredo Durán, Mario Guajardo, Agustina F. López, Javier Marenco, and Gonzalo Andres Zamorano. Scheduling Multiple Sports Leagues with Travel Distance Fairness: An Application to Argentinean Youth Football. INFORMS J. Appl. Anal., 51(2):136–149, 2021. doi:10.1287/INTE.2020.1048.
- [11] Dries Goossens. Optimization in sports league scheduling: experiences from the Belgian Pro League soccer. In International conference on operations research and enterprise systems, pages 3–19. Springer, 2017.
- [12] Dries R. Goossens and Frits C. R. Spieksma. Scheduling the Belgian Soccer League. Interfaces, 39(2):109–118, 2009. doi:10.1287/INTE.1080.0402.
- [13] Dries R. Goossens and Frits C. R. Spieksma. Breaks, cuts, and patterns. Oper. Res. Lett., 39(6):428–432, 2011. doi:10.1016/J.ORL.2011.09.001.
- [14] Mark R. Grabau. Softball Scheduling as Easy as 1-2-3 (Strikes You’re Out). Interfaces, 42(3):310–319, 2012. doi:10.1287/INTE.1110.0559.
- [15] Xiaoyun Ji and John E Mitchell. Finding optimal realignments in sports leagues using a branch-and-cut-and-price approach. Int. J. Oper. Res., 1(1-2):101–122, 2005.
- [16] Graham Kendall. Scheduling English football fixtures over holiday periods. J. Oper. Res. Soc., 59(6):743–755, 2008. doi:10.1057/PALGRAVE.JORS.2602382.
- [17] Fabian Kleintges-Topoll. Blog: Geringe Auswirkungen durch Strukturreform. https://www.mytischtennis.de/public/mytt-bloggt/18283/, 2023. Accessed: 13 Mar 2025.
- [18] Sigrid Knust and Daniel Lücking. Minimizing costs in round robin tournaments with place constraints. Computers & Operations Research, 36(11):2937–2943, 2009. doi:10.1016/J.COR.2009.01.004.
- [19] Jari Kyngäs, Kimmo Nurmi, Nico Kyngäs, George Lilley, Thea Salter, and Dries Goossens. Scheduling the Australian football league. Journal of the Operational Research Society, 68(8):973–982, 2017. doi:10.1057/S41274-016-0145-8.
- [20] Jeffrey Larson and Mikael Johansson. Constructing schedules for sports leagues with divisional and round-robin tournaments. Journal of Quantitative Analysis in Sports, 10(2):119–129, 2014.
- [21] Miao Li, Morteza Davari, and Dries R. Goossens. Multi League Repository. https://www.sportscheduling.ugent.be/RobinX/multiLeagueRepo.php, 2023.
- [22] Miao Li, Morteza Davari, and Dries R. Goossens. Multi-league sports scheduling with different leagues sizes. Eur. J. Oper. Res., 307(1):313–327, 2023. doi:10.1016/J.EJOR.2022.10.010.
- [23] Miao Li and Dries Goossens. Grouping and scheduling multiple sports leagues: an integrated approach. Journal of the Operational Research Society, pages 1–19, 2024.
- [24] Vardges Melkonian. An Integer Programming Approach for Scheduling a Professional Sports League. American Journal of Computational Mathematics, 14(4):401–423, 2024.
- [25] Ryuhei Miyashiro, Hideya Iwasaki, and Tomomi Matsui. Characterizing Feasible Pattern Sets with a Minimum Number of Breaks. In Edmund K. Burke and Patrick De Causmaecker, editors, Practice and Theory of Automated Timetabling IV, 4th International Conference, PATAT 2002, pages 78–99. Springer, 2002. doi:10.1007/978-3-540-45157-0_5.
- [26] Douglas Moody, Graham Kendall, and Amotz Bar-Noy. Youth sports leagues scheduling. In Proceedings of the 8th International Conference on the Practice and Theory of Automated Timetabling, pages 283–293, 2010.
- [27] Laurent Perron and Frédéric Didier. CP-SAT. https://developers.google.com/optimization/cp/cp_solver/.
- [28] Diego Recalde, Ramiro Torres, and Polo Vaca. Scheduling the professional Ecuadorian football league by integer programming. Comput. Oper. Res., 40(10):2478–2484, 2013. doi:10.1016/J.COR.2012.12.017.
- [29] Robert M Saltzman and Richard M Bradford. Optimal realignments of the teams in the National Football League. European Journal of Operational Research, 93(3):469–475, 1996.
- [30] Jörn Schönberger. Scheduling of sport league systems with inter-league constraints. Technical report, Diskussionsbeiträge aus dem Institut für Wirtschaft und Verkehr, 2015.
- [31] Jörn Schönberger. The championship timetabling problem-construction and justification of test cases. In Proceedings of MathSport International 2017 Conference, volume 330, 2017.
- [32] Ling-Huey Su, Yufang Chiu, and TC Edwin Cheng. Sports tournament scheduling to determine the required number of venues subject to the minimum timeslots under given formats. Computers & Industrial Engineering, 65(2):226–232, 2013. doi:10.1016/J.CIE.2013.02.021.
- [33] Túlio A. M. Toffolo, Jan Christiaens, Frits C. R. Spieksma, and Greet Vanden Berghe. The sport teams grouping problem. Ann. Oper. Res., 275(1):223–243, 2019. doi:10.1007/S10479-017-2595-Z.
- [34] David Van Bulck, Dries R Goossens, and Frits CR Spieksma. Scheduling a non-professional indoor football league: a tabu search based approach. Ann. Oper. Res., 275(2):715–730, 2019. doi:10.1007/S10479-018-3013-X.
- [35] Nils Weidmann. KeyGenerator. https://github.com/NilsWeidmann/KeyGenerator.
- [36] Nils Weidmann. Multi-League Sports Scheduling. https://github.com/NilsWeidmann/Multi-League-Sports-Scheduling, 2025.
Appendix A Appendix
A.1 Performance Evaluation Results - Runtime (in s)
• Optimal • Feasible • Infeasible
A.2 Performance Evaluation Results - Objective Function Values
• Optimal • Feasible • Infeasible
