publications
publications by category in reverse chronological order, generated by jekyll-scholar.
2023
- Chapter 14 - Simulation-based generation of rescheduling knowledge using a cognitive architecture. Jorge Andrés Palombarini, Juan Cruz Barsce, and Ernesto Carlos Martínez. In Designing Smart Manufacturing Systems, 2023.
No schedule can withstand the test of time. Unexpected disruptions are ubiquitous on manufacturing shop floors and in supply chains. Automating rescheduling decisions is thus essential to increase the type and level of autonomy used to respond in a timely manner to unplanned events. Problem-specific rescheduling knowledge is often scarce or unavailable. In this work, based on the Soar cognitive architecture, simulated transitions between schedule states due to repair operators are used to generate and compile knowledge for responding reactively to unforeseen events caused by internal and external factors such as machine breakdowns, rush orders, material shortages and the need for reprocessing operations. Rescheduling knowledge is obtained in the form of dynamic first-order logical decision rules which can be applied in real time to repair a schedule and guarantee feasibility. The simulation-based approach is implemented using the Problem Space Computational Model formalism, which readily integrates reinforcement learning algorithms with artificial cognitive capabilities such as memorization, chunking, and reasoning in a rescheduling agent. The Soar architecture is used to learn rescheduling rules for an industrial case study.
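The learning mechanism at the heart of this approach, selecting among schedule-repair operators by reinforcement, can be sketched in a few lines. The following is a minimal toy illustration, not the chapter's Soar/PSCM implementation: the operator names, the unit-time tardiness objective and the state bucketing are hypothetical stand-ins.

```python
import random
from collections import defaultdict

# Hypothetical repair operators on a toy schedule (a permutation of jobs);
# the names are illustrative, not the operators used in the chapter.
def swap_adjacent(seq):
    s = seq[:]
    i = random.randrange(len(s) - 1)
    s[i], s[i + 1] = s[i + 1], s[i]
    return s

def move_to_front(seq):
    s = seq[:]
    s.insert(0, s.pop(random.randrange(len(s))))
    return s

OPERATORS = [swap_adjacent, move_to_front]

def tardiness(seq, due):
    # Unit processing times: the job at position t finishes at time t + 1.
    return sum(max(0, t + 1 - due[j]) for t, j in enumerate(seq))

def bucket(seq, due):
    # Coarse state abstraction: bucketed total tardiness.
    return min(tardiness(seq, due) // 2, 5)

# Tabular Q-learning over (state bucket, operator) pairs, standing in for
# Soar-RL's learned numeric preferences over repair operators.
Q = defaultdict(float)
alpha, gamma, eps = 0.2, 0.95, 0.1
due = {j: random.randint(1, 6) for j in range(6)}

for episode in range(500):
    seq = random.sample(list(due), len(due))      # a freshly disrupted schedule
    for step in range(20):
        s = bucket(seq, due)
        a = (random.randrange(len(OPERATORS)) if random.random() < eps
             else max(range(len(OPERATORS)), key=lambda k: Q[s, k]))
        repaired = OPERATORS[a](seq)
        r = tardiness(seq, due) - tardiness(repaired, due)   # reward = improvement
        s2 = bucket(repaired, due)
        best_next = max(Q[s2, k] for k in range(len(OPERATORS)))
        Q[s, a] += alpha * (r + gamma * best_next - Q[s, a])
        seq = repaired
```

In the chapter, the analogous role is played by Soar-RL preferences over repair operators, which chunking compiles into the first-order decision rules applied at run time.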
2021
- Automatic Tuning of Hyper-Parameters of Reinforcement Learning Algorithms Using Bayesian Optimization with Behavioral Cloning. Juan Cruz Barsce, Jorge A. Palombarini, and Ernesto C. Martínez. arXiv:2112.08094 [cs], Dec 2021.
Optimal setting of several hyper-parameters in machine learning algorithms is key to making the most of available data. To this end, several methods such as evolutionary strategies, random search, Bayesian optimization and heuristic rules of thumb have been proposed. In reinforcement learning (RL), the information content of the data gathered by the learning agent while interacting with its environment depends heavily on the setting of many hyper-parameters. The user of an RL algorithm therefore has to rely on search-based optimization methods, such as grid search or the Nelder-Mead simplex algorithm, which are very inefficient for most RL tasks, slow down the learning curve significantly, and leave the user with the burden of purposefully biasing data gathering. In this work, to make an RL algorithm more user-independent, a novel approach for autonomous hyper-parameter setting using Bayesian optimization is proposed. Data from past episodes and different hyper-parameter values are used at a meta-learning level by performing behavioral cloning, which helps improve the effectiveness of maximizing a reinforcement learning variant of an acquisition function. Also, by tightly integrating Bayesian optimization into the design of a reinforcement learning agent, the number of state transitions needed to converge to the optimal policy for a given task is reduced. Computational experiments reveal promising results compared to manual tweaking and other optimization-based approaches, which highlights the benefit of changing the algorithm hyper-parameters to increase the information content of generated data.
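As a rough illustration of the outer loop, the sketch below tunes two hyper-parameters of a toy Q-learning agent with a Gaussian-process surrogate and the standard expected improvement acquisition function. The paper's behavioral-cloning component and its reinforcement learning variant of the acquisition function are omitted, and the chain task and the function names are invented for the example.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def run_rl(alpha, eps, episodes=200, n=10, seed=0):
    """Tabular Q-learning on a toy 1D chain; returns the mean return of the
    last 50 episodes. A stand-in for the RL agent tuned in the paper."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n, 2))                      # states x actions (left/right)
    returns = []
    for _ in range(episodes):
        s, G = 0, 0.0
        for _ in range(50):
            a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
            s2 = min(s + 1, n - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n - 1 else 0.0
            Q[s, a] += alpha * (r + 0.95 * Q[s2].max() - Q[s, a])
            s, G = s2, G + r
            if r > 0:
                break
        returns.append(G)
    return float(np.mean(returns[-50:]))

# Bayesian optimization of (alpha, eps): GP surrogate + expected improvement.
bounds = np.array([[0.01, 1.0], [0.0, 0.5]])
rng = np.random.default_rng(1)
X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))   # initial design
y = np.array([run_rl(a, e) for a, e in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for it in range(20):
    gp.fit(X, y)
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(500, 2))
    mu, sd = gp.predict(cand, return_std=True)
    imp = mu - y.max() - 0.01                 # 0.01 = exploration margin
    z = imp / np.maximum(sd, 1e-9)
    ei = imp * norm.cdf(z) + sd * norm.pdf(z) # expected improvement
    x_next = cand[ei.argmax()]
    X = np.vstack([X, x_next])
    y = np.append(y, run_rl(*x_next))

print("best (alpha, eps):", X[y.argmax()], "return:", y.max())
```

The paper's contribution sits precisely where this sketch is naive: instead of treating each RL run as an opaque scalar, behavioral cloning reuses the state transitions gathered under past hyper-parameter settings at the meta-learning level.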
2020
- A Hierarchical Two-tier Approach to Hyper-parameter Optimization in Reinforcement Learning. Juan Cruz Barsce, Jorge Palombarini, and Ernesto Martínez. Electronic Journal of SADIO (EJS), May 2020.
Optimization of hyper-parameters in real-world applications of reinforcement learning (RL) is a key issue, because their settings determine how fast the agent learns its policy by interacting with its environment, and thus shape the information content of the data gathered. In this work, an approach that uses Bayesian optimization to perform an autonomous two-tier optimization of both representation decisions and algorithm hyper-parameters is proposed: first, categorical/structural RL hyper-parameters are treated as binary variables and optimized with an acquisition function tailored to that type of variable. Then, at a lower level of abstraction, solution-level hyper-parameters are optimized using the expected improvement acquisition function, with the categorical hyper-parameters found at the upper level of abstraction held fixed. This two-tier approach is validated with both tabular and neural-network representations of the value function in a classic simulated control task. The results obtained are promising and open the way for more user-independent applications of reinforcement learning.
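A minimal sketch of the two-tier idea follows, assuming a synthetic objective in place of a real RL run. The upper tier fixes the binary structural choices (enumerated here because the toy space has only four combinations, whereas the paper uses an acquisition function tailored to binary variables), and the lower tier runs expected improvement over a continuous hyper-parameter with those choices held fixed.

```python
import itertools
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def run_rl(use_traces, use_softmax, alpha):
    """Hypothetical objective: stands in for an RL run whose structural
    choices (binary) and learning rate (continuous) are being tuned."""
    base = 0.6 + 0.2 * use_traces - 0.1 * use_softmax
    return base - 4.0 * (alpha - (0.3 + 0.2 * use_traces)) ** 2

best = None
rng = np.random.default_rng(0)
# Upper tier: categorical/structural hyper-parameters as binary variables.
for use_traces, use_softmax in itertools.product([0, 1], repeat=2):
    # Lower tier: expected improvement over the continuous hyper-parameter,
    # with the structural choices held fixed.
    X = rng.uniform(0.01, 1.0, size=(4, 1))
    y = np.array([run_rl(use_traces, use_softmax, a) for (a,) in X])
    gp = GaussianProcessRegressor(normalize_y=True)
    for _ in range(10):
        gp.fit(X, y)
        cand = rng.uniform(0.01, 1.0, size=(200, 1))
        mu, sd = gp.predict(cand, return_std=True)
        z = (mu - y.max()) / np.maximum(sd, 1e-9)
        ei = (mu - y.max()) * norm.cdf(z) + sd * norm.pdf(z)
        x_next = float(cand[ei.argmax(), 0])
        X = np.vstack([X, [[x_next]]])
        y = np.append(y, run_rl(use_traces, use_softmax, x_next))
    if best is None or y.max() > best[0]:
        best = (y.max(), use_traces, use_softmax, float(X[y.argmax(), 0]))

print("best score, traces, softmax, alpha:", best)
```

Splitting the search this way keeps the GP surrogate defined over a purely continuous space at the lower tier, which is where the standard expected improvement machinery behaves well.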
2019
- A Hierarchical Two-tier Approach to Hyper-parameter Optimization in Reinforcement Learning. Juan Cruz Barsce, Jorge A. Palombarini, and Ernesto Martínez. In Anales del Simposio Argentino de Inteligencia Artificial (ASAI) 2019, Sep 2019.
Optimization of hyper-parameters in reinforcement learning (RL) algorithms is a key task, because they determine how the agent learns its policy by interacting with its environment, and thus what data is gathered. In this work, an approach that uses Bayesian optimization to perform a two-step optimization is proposed: first, categorical RL structure hyper-parameters are treated as binary variables and optimized with an acquisition function tailored to such variables. Then, at a lower level of abstraction, solution-level hyper-parameters are optimized using the expected improvement acquisition function, with the best categorical hyper-parameters found at the upper level of abstraction held fixed. This two-tier approach is validated in a simulated control task. The results obtained are promising and open the way for more user-independent applications of reinforcement learning.
2017
- Towards Autonomous Reinforcement Learning: Automatic Setting of Hyper-Parameters Using Bayesian Optimization. J. C. Barsce, J. A. Palombarini, and E. C. Martínez. In 2017 XLIII Latin American Computer Conference (CLEI), Sep 2017.
With the increasing use of machine learning by industry and scientific communities in a variety of tasks such as text mining, image recognition and self-driving cars, automatic setting of hyper-parameters in learning algorithms is a key factor in obtaining good performance regardless of user expertise in the inner workings of the techniques and methodologies. In particular, for a reinforcement learning task, the efficiency with which an agent learns a policy in an uncertain environment depends strongly on how the hyper-parameters of the algorithm are set. In this work, an autonomous framework that employs Bayesian optimization and Gaussian process regression to optimize the hyper-parameters of a reinforcement learning algorithm is proposed. A gridworld example is discussed to show how the hyper-parameter configuration of a learning algorithm (SARSA) is iteratively improved based on two performance functions.
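The inner learning algorithm being tuned can be as simple as tabular SARSA on a gridworld. The sketch below is a hypothetical stand-in for the paper's setup, not its actual environment or performance functions: it maps a setting of (alpha, eps) to a scalar performance value, exactly the kind of function the Bayesian optimization layer would query.

```python
import numpy as np

def sarsa_gridworld(alpha, eps, gamma=0.95, episodes=300, size=5, seed=0):
    """Minimal SARSA on a size x size gridworld with the goal in the far
    corner. Returns the mean return of the last 50 episodes, i.e. the kind
    of performance function a Bayesian optimizer would maximize."""
    rng = np.random.default_rng(seed)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right
    Q = np.zeros((size, size, 4))

    def policy(s):
        # eps-greedy action selection over the current value estimates.
        return int(rng.integers(4)) if rng.random() < eps else int(Q[s].argmax())

    returns = []
    for _ in range(episodes):
        s, G = (0, 0), 0.0
        a = policy(s)
        for _ in range(100):
            r_, c_ = s[0] + moves[a][0], s[1] + moves[a][1]
            s2 = (min(max(r_, 0), size - 1), min(max(c_, 0), size - 1))
            r = 1.0 if s2 == (size - 1, size - 1) else -0.01
            a2 = policy(s2)
            Q[s][a] += alpha * (r + gamma * Q[s2][a2] - Q[s][a])  # SARSA update
            s, a, G = s2, a2, G + r
            if r > 0:
                break
        returns.append(G)
    return float(np.mean(returns[-50:]))

# Two hand-picked settings; the framework in the paper would instead propose
# settings iteratively via Gaussian process regression and Bayesian optimization.
print(sarsa_gridworld(alpha=0.5, eps=0.1), sarsa_gridworld(alpha=0.05, eps=0.4))
```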
2013
- A Cognitive Approach to Real-Time Rescheduling Using SOAR-RL. Juan Cruz Barsce, Jorge Palombarini, and Ernesto Martínez. In XVIII Congreso Argentino de Ciencias de la Computación, Oct 2013.
Ensuring flexible and efficient manufacturing of customized products in an increasingly dynamic and turbulent environment, without sacrificing cost effectiveness, product quality and on-time delivery, has become a key issue for most industrial enterprises. A promising approach to coping with this challenge is the integration of cognitive capabilities into systems and processes, with the aim of expanding the knowledge base used to perform managerial and operational tasks. In this work, a novel approach to real-time rescheduling is proposed in order to achieve sustainable improvements in the flexibility and adaptability of production systems through the integration of artificial cognitive capabilities involving perception, reasoning/learning and planning skills. Moreover, an industrial example is discussed in which the capabilities of the SOAR cognitive architecture are integrated into a software prototype, showing that the approach enables the rescheduling system to respond to events in an autonomic way and to acquire experience through intensive simulation while performing repair tasks.