Study of Methods for One-dimensional Bin Packing Problem
Ponente(s): Jessica Elena González San Martín, Laura Cruz-Reyes
The Bin Packing Problem (BPP) is a classic optimization problem that is known for its applicability and complexity, which belongs to a particular class of problems called NP-hard, in which, given a set of items of variable size, we search to accommodate them inside fixed size containers, seeking to optimize the number of containers to be used, that is, using the least number of containers to place the largest number of items possible. For this problem, there are different variants with respect to the dimensions of objects (1D-BPP, 2D-BPP and 3D-BPP), number of objectives to satisfied (Multiobjective) and changes on time (Dynamic). In this study we will focus only on the One-dimensional Bin Packing Problem (1D-BPP), because it is the base problem for the multidimensional, multiobjective and dynamic variants and has been preserved as a current study problem due to the various applications that it offers. In the current state of the art for 1D-BPP, there are different algorithms, mainly heuristics for solving the problem, however, there is no heuristic or metaheuristic algorithm capable of finding the optimal solution for all possible instances of a problem of this type despite the scientific community’s efforts. This work presents a study of methods and strategies that have been used to address 1D-BPP in the last two decades, in order to identify the most promising ones regarding algorithmic performance. The main objective is that this study can help both researchers and professionals interested in using specific components or techniques that help improve the behavior of an algorithm to solve this problem. Among the strategies implemented in recent years we find the hybridization of metaheuristic algorithms with machine learning, with reinforcement learning (RL) being the most used, which is an excellent alternative to dynamically adapt the search for these heuristics by training an agent in a supervised or self-supervised manner [1]. A reinforcement learning agent tries to learn the behavior through trial-and-error interactions in a dynamic and uncertain environment. The agent does not know what actions to take, and it must automatically discover which actions produce the maximum benefit. In addition, it must be able to adapt to changes that occur in the environment. On each interaction, the agent receives an indication of its current state and selects an action. The action changes state and the agent receives a reinforcement or reward signal. The agent's goal is to find a policy that associates states with actions, maximizing the long-term accumulated reward.
Currently, we see the field of RL for combinatorial optimization (CO) problems as a promising direction for this research line because of the effectiveness in terms of the solution quality, the capacity to outperform the existing algorithms, and huge running time gains compared to the classical heuristic approaches.
We present a grouping genetic algorithm as case of study. In the proposed version, a new strategy was implemented incorporating reinforcement learning to one of the genetic operators. According to the results obtained in the experimentation, the proposed version improves the performance of the original algorithm by 6.23%, solving 80 more instances and reducing the execution time. With this work, it was possible to identify some promising strategies that could be addressed in the future for 1D-BPP and obtain a greater impact on performance if are combined or implemented in conjunction with other methods.