Description
One of the main challenges that satellites face is the progressive accumulation of debris in low Earth orbit (LEO). This motivates new strategies for debris removal, as well as for servicing and refuelling existing satellites to extend their lifespan.
This article proposes a Deep Reinforcement Learning (DRL) framework to optimize the trajectory of a chaser satellite tasked with retrieving space debris or servicing other spacecraft. The experiments were conducted in a simulated environment containing multiple debris objects.
The proposed approach handles imperfect environmental modelling and noisy measurements by framing the task as a Partially Observable Markov Decision Process (POMDP). Hidden state information is replaced by a belief derived from the observation history, which a Long Short-Term Memory (LSTM) network encodes into a fixed-length sequence. A Transformer encoder then weights this sequence via self-attention to capture the non-linear dynamics of the signals. The resulting semantic history is consumed by an agent trained with Proximal Policy Optimization (PPO), an on-policy policy-gradient method. PPO relies on two neural networks: a critic for value estimation and an actor that outputs the policy, implemented as either Multi-Layer Perceptrons (MLPs) or 1D Convolutional Neural Networks (CNNs) to exploit temporal information.
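To make the pipeline concrete, the following is a minimal PyTorch sketch of the belief-encoding and actor-critic components described above. All dimensions, layer counts, and names (obs_dim, hidden_dim, act_dim, and so on) are illustrative assumptions, not the article's actual architecture or hyperparameters.

```python
# Minimal sketch of the LSTM + Transformer belief encoder feeding PPO
# actor-critic heads. Every size below is an assumed placeholder.
import torch
import torch.nn as nn

class BeliefEncoder(nn.Module):
    """LSTM compresses the observation history into a fixed-length latent
    sequence; a Transformer encoder then re-weights it with self-attention."""
    def __init__(self, obs_dim=12, hidden_dim=64, n_heads=4, n_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, obs_history):          # (batch, seq_len, obs_dim)
        h, _ = self.lstm(obs_history)        # latent sequence over the history
        z = self.transformer(h)              # attention-weighted history
        return z.mean(dim=1)                 # pooled "semantic history"

class ActorCritic(nn.Module):
    """PPO heads: the actor parameterizes a Gaussian policy over thrust
    commands, the critic estimates the state value."""
    def __init__(self, feat_dim=64, act_dim=3):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
        self.critic = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.Tanh(), nn.Linear(64, 1))
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, feat):
        mean = self.actor(feat)              # policy mean (thrust command)
        value = self.critic(feat)            # state-value estimate
        return mean, self.log_std.exp(), value
```

The 1D-CNN variant mentioned above would replace the MLP heads with convolutions over the latent sequence instead of pooling it first.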
The model considers the motion of the satellite and debris in LEO under J2 perturbation and atmospheric drag. The reward function is designed to achieve rendezvous with the debris while minimizing fuel consumption and manoeuvre duration and maintaining an optimal relative velocity.
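A reward of this form could, for instance, combine the stated objectives as a weighted sum. The terms, weights (w_*), and reference velocity v_ref below are an illustrative sketch under assumed values, not the article's actual reward function.

```python
# Illustrative reward shaping consistent with the stated objectives:
# rendezvous distance, fuel use, manoeuvre duration, relative velocity.
import numpy as np

def reward(rel_pos, rel_vel, dv_used, dt,
           w_dist=1.0, w_fuel=0.1, w_time=0.01, w_vel=0.5, v_ref=0.1):
    r = -w_dist * np.linalg.norm(rel_pos)               # close the gap to the target
    r -= w_fuel * dv_used                               # penalize delta-v (fuel) spent
    r -= w_time * dt                                    # penalize manoeuvre duration
    r -= w_vel * abs(np.linalg.norm(rel_vel) - v_ref)   # approach at a safe speed
    return r
```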
Case studies based on available debris-tracking data demonstrate the efficacy of Transformer-based DRL in improving the precision, efficiency, and safety of Active Debris Removal (ADR) and In-Orbit Servicing (IOS) missions. The article concludes with a discussion of the future potential of DRL in advancing autonomous space operations and ensuring long-term space sustainability.