CRIU - User contributions [en]

Google Summer of Code Ideas

2026-03-02T13:59:36Z

Amikhalitsyn: /* Project ideas */ COW pages dump optimization

Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three-month period.

This page contains project ideas for upcoming Google Summer of Code.

== Contact ==

First, make sure to go through the [[GSoC Students Recommendations]]. Once you have built CRIU locally and successfully checkpointed and restored (C/R) a simple process, please contact the respective mentor for the idea you are interested in. For general questions, feel free to send an email to the [mailto:criu@lists.linux.dev mailing list] or write on [https://gitter.im/save-restore/criu Gitter].

== Project ideas ==

=== Kubernetes Operator for Automated Checkpointing ===

'''Summary:''' Extend the Checkpoint/Restore Operator with support for automated policy-based checkpointing.

The [https://github.com/checkpoint-restore/checkpoint-restore-operator Checkpoint/Restore Operator] for Kubernetes currently supports only policies and parameters that limit the number of checkpoints. This project aims to extend the current support with automated policy-based checkpointing, allowing users to define triggers for checkpoint creation, such as time-based schedules, resource thresholds (CPU, memory, I/O usage), Kubernetes events (node drain, pod eviction, preemption), and application-level signals or annotations.

'''Links:'''
* https://github.com/checkpoint-restore/checkpoint-restore-operator
* https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api

'''Details:'''
* Skill level: intermediate
* Language: Go
* Expected size: 350 hours
* Mentors: Viktória Spišaková <spisakova@ics.muni.cz>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>

=== Forensic Checkpointing Framework for Kubernetes ===

Kubernetes provides a highly dynamic and ephemeral environment where workloads can start and disappear very quickly and are continuously being rescheduled across different nodes in the cluster.
One of the key challenges with forensic investigations in Kubernetes is capturing and preserving the evidence during security incidents. This project aims to address this problem by developing a framework for efficiently capturing and preserving the state of all running applications in a container at a specific point in time, along with the associated container configurations and metadata. These artifacts would allow investigators to accurately reconstruct the events, create a timeline, and analyze security incidents without impacting the running cluster. This is an important step towards enabling forensic readiness for Kubernetes, where cluster administrators proactively ensure the environments are prepared to collect and preserve evidence before a security incident occurs.

'''Links:'''
* https://github.com/checkpoint-restore/checkpointctl
* [https://fosdem.org/2026/events/attachments/F9RANH-forensic-snapshots-in-kubernetes/slides/267371/fosdem_2_4dh73ni.pdf Investigating Security Incidents with Forensic Snapshots in Kubernetes]
* [https://www.cncf.io/reports/cloud-native-security-whitepaper/ Cloud Native Security Whitepaper]
* [https://media.defense.gov/2022/Aug/29/2003066362/-1/-1/0/CTR_KUBERNETES_HARDENING_GUIDANCE_1.2_20220829.PDF Kubernetes Hardening Guide]

'''Details:'''
* Skill level: intermediate
* Language: Go
* Expected size: 350 hours
* Mentors: Lorena Goldoni <lory.goldoni@gmail.com>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>

=== Enabling Checkpoint/Restore of Rootless Containers ===

[https://rootlesscontaine.rs/ Rootless containers] are containers that can be created, run, and managed by unprivileged users. Container engines such as Podman natively support running containers in a rootless mode to improve security and usability. While checkpoint/restore functionality is already available for rootful containers and unprivileged checkpointing is possible with the <code>CAP_CHECKPOINT_RESTORE</code> capability, container engines do not yet support native checkpointing of containers running in rootless mode. This project aims to explore and address the remaining challenges required to enable unprivileged checkpoint/restore for rootless containers.

'''Links:'''
* https://github.com/checkpoint-restore/criu/pull/1930
* https://github.com/torvalds/linux/commit/124ea650d3072b005457faed69909221c2905a1f
* https://src.fedoraproject.org/rpms/criu/pull-request/10#request_diff

'''Details:'''
* Skill level: intermediate
* Language: C, Go
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>

=== Files on detached mounts ===

'''Summary:''' Initial support for open files on "detached" mounts

When criu dumps a process with an open fd on a file, it gets the mount identifier (mnt_id) via /proc/<pid>/fdinfo/<fd>, so that criu knows exactly which mount the file was originally opened from. This way criu can restore the fd by opening the same file from the topologically equivalent mount in the restored mount tree.
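As a rough illustration of this lookup (not CRIU's actual code), the sketch below reads the mnt_id field from /proc/self/fdinfo/<fd> and checks whether that ID appears as the first field of a line in /proc/self/mountinfo. The function names <code>fd_mnt_id</code> and <code>mnt_id_listed</code> are hypothetical, and the sketch only inspects the calling process.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the mnt_id field from /proc/self/fdinfo/<fd>.
 * Returns the mount ID, or -1 if the file cannot be read or the
 * field is missing.
 */
int fd_mnt_id(int fd)
{
	char path[64], line[256];
	int mnt_id = -1;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
	f = fopen(path, "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "mnt_id: %d", &mnt_id) == 1)
			break;
	}
	fclose(f);
	return mnt_id;
}

/*
 * Check whether mnt_id is listed in /proc/self/mountinfo (the mount ID
 * is the first field of each line).
 * Returns 1 if listed, 0 if not, -1 if mountinfo cannot be read.
 */
int mnt_id_listed(int mnt_id)
{
	char line[4096];
	int id, found = 0, at_line_start = 1;
	FILE *f;

	assert(mnt_id >= 0);
	f = fopen("/proc/self/mountinfo", "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		/* Only parse the ID at the real start of a line. */
		if (at_line_start && sscanf(line, "%d", &id) == 1 && id == mnt_id) {
			found = 1;
			break;
		}
		at_line_start = strchr(line, '\n') != NULL;
	}
	fclose(f);
	return found;
}
```

When <code>mnt_id_listed(fd_mnt_id(fd))</code> returns 0, the fd refers to a mount that is not visible in the mount namespace of the process.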

Restoring the fd from the right mount can be important in several cases. For instance, the process may later resolve paths relative to the fd, and resolving from the same file on a different mount can lead to different resolved paths. The process may also check the path to the file via /proc/<pid>/fd/<fd>.

But we have a problem finding the mount on which to reopen the file at restore if we only know the mnt_id and cannot find it in /proc/<pid>/mountinfo.

The mountinfo file shows the mount tree topology of the current mntns: parent-child relations, sharing group information, mountpoint and fs root information. If we don't see the mnt_id in it, we don't know anything about this mount.

This can happen in two cases:

* 1) external mount or file - if the file was opened from e.g. the host, its mount would not be visible in the container's mountinfo
* 2) the mount was lazily unmounted

In case 1) we have criu options that help criu handle external dependencies.

In case 2), or if no such options are provided, criu can't resolve the mnt_id in mountinfo and fails.

'''Solution:'''
We can handle 2) as follows. At dump, resolve the major/minor device numbers via fstat, then use name_to_handle_at and open_by_handle_at to open the same file on any other available mount of the same superblock (same major/minor) in the container. This gives us fd2, which refers to the same file as fd but on an existing mount, so we can dump fd2 as usual and mark the file as "detached" in the image. On restore criu then knows where to find the file, but instead of just opening it from the restored mount, it creates a temporary bind mount, opens the file there, and lazily unmounts the bind mount right after the open, so that the file appears as a file on a detached mount.

Known problems with this approach:

* stat on btrfs gives the wrong major/minor
* file handles do not work everywhere
* file handles can return fd2 on a deleted file or on another hardlink; this needs special handling.
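The dump-side part of this approach could be sketched roughly as below. This is only an illustration under assumptions: <code>struct detached_file_id</code> and both function names are hypothetical and not part of CRIU, and the reopen step uses open_by_handle_at, which requires CAP_DAC_READ_SEARCH. The sketch does not deal with any of the known problems listed above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* name_to_handle_at(), struct file_handle */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/* Hypothetical record of what is needed to find the same file again. */
struct detached_file_id {
	unsigned int major, minor;  /* device numbers reported by fstat() */
	struct file_handle *handle; /* NULL if no handle could be obtained */
	int mount_id;               /* as reported by name_to_handle_at() */
};

/*
 * Collect device numbers and, if the filesystem supports it, a file
 * handle for an open fd. Returns 0 on success, -1 if fstat() fails or
 * memory runs out. On success the caller owns out->handle and must
 * free() it.
 */
int detached_file_lookup(int fd, struct detached_file_id *out)
{
	struct file_handle *fh;
	struct stat st;

	assert(out != NULL);
	if (fstat(fd, &st) < 0)
		return -1;

	out->major = major(st.st_dev);
	out->minor = minor(st.st_dev);
	out->handle = NULL;
	out->mount_id = -1;

	fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
	if (!fh)
		return -1;
	fh->handle_bytes = MAX_HANDLE_SZ;

	/* AT_EMPTY_PATH: take the handle of the file fd itself refers to. */
	if (name_to_handle_at(fd, "", fh, &out->mount_id, AT_EMPTY_PATH) < 0) {
		free(fh); /* file handles do not work everywhere */
		return 0;
	}
	out->handle = fh;
	return 0;
}

/*
 * Reopen the file through another mount of the same superblock.
 * mount_fd is any fd opened on that mount. Needs CAP_DAC_READ_SEARCH.
 */
int detached_file_reopen(int mount_fd, const struct detached_file_id *id, int flags)
{
	if (!id->handle) {
		errno = EOPNOTSUPP;
		return -1;
	}
	return open_by_handle_at(mount_fd, id->handle, flags);
}
```

A caller would pick <code>mount_fd</code> by scanning mountinfo for a visible mount whose major/minor match the values stored in the record.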

Additionally (optional part):
We can export the real major/minor in fdinfo (kernel).
We can think of a new kernel interface to get a mount's major/minor and root (shift from fsroot) for detached mounts. If we have it, we don't need the file handle hack to find the file on another mount (see the fsinfo or getvalues kernel patches on LKML; can we add this info there?).

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentor: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Checkpointing of POSIX message queues ===

'''Summary:''' Add support for checkpoint/restore of POSIX message queues

POSIX message queues are a widely used inter-process communication mechanism. Message queues are implemented as files on a virtual filesystem (mqueue), where a file descriptor (message queue descriptor) is used to perform operations such as sending or receiving messages. To support checkpoint/restore of POSIX message queues, we need a kernel interface (similar to [https://github.com/checkpoint-restore/criu/commit/8ce9e947051e43430eb2ff06b96dddeba467b4fd MSG_PEEK]) that would enable the retrieval of messages from a queue without removing them. This project aims to implement such an interface that allows retrieving all messages and their priorities from a POSIX message queue.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/2285
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/ipc/mqueue.c
* https://www.man7.org/tlpi/download/TLPI-52-POSIX_Message_Queues.pdf

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Add support for SCM_CREDENTIALS / SCM_PIDFD and friends ===

'''Summary:''' Support for SCM_CREDENTIALS / SCM_PIDFD

SCM_CREDENTIALS and SCM_PIDFD are types of SCM (Socket-level Control Messages). They play a crucial role
in systemd and many other user space applications. This project is about adding support for saving
and restoring these SCMs properly with CRIU. There is existing code in the OpenVZ CRIU fork,
see [1] and [2]. The first goal would be to port this code properly, cover it with extensive tests and
ensure that SCM_PIDFD / SO_PEERPIDFD are handled correctly. We also expect to cover things like
SO_PASSRIGHTS and SO_PASSPIDFD.

There is an extra source of complexity here: pidfds can be "stale" (see PIDFD_STALE in the Linux kernel),
and we need to ensure that we properly cover those cases.
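For readers unfamiliar with these control messages, the following minimal sketch shows how SCM_CREDENTIALS reaches a receiver that has SO_PASSCRED set. It is independent of CRIU and of the OpenVZ code; the function name <code>recv_peer_creds</code> is hypothetical. The point is that the sender's credentials travel with each queued message, so they are part of the socket queue state that a dump would have to capture.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* struct ucred */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send one byte over a Unix datagram socket pair and receive it together
 * with the SCM_CREDENTIALS control message that the kernel attaches when
 * SO_PASSCRED is enabled on the receiving socket.
 * Returns 0 and fills *cred on success, -1 on error.
 */
int recv_peer_creds(struct ucred *cred)
{
	union {
		char buf[CMSG_SPACE(sizeof(struct ucred))];
		struct cmsghdr align; /* forces correct alignment of buf */
	} ctl;
	char data = 'x', rx;
	struct iovec iov = { .iov_base = &rx, .iov_len = 1 };
	struct msghdr msg;
	struct cmsghdr *c;
	int sk[2], on = 1, ret = -1;

	assert(cred != NULL);
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sk) < 0)
		return -1;
	if (setsockopt(sk[1], SOL_SOCKET, SO_PASSCRED, &on, sizeof(on)) < 0)
		goto out;
	if (send(sk[0], &data, 1, 0) != 1)
		goto out;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);
	if (recvmsg(sk[1], &msg, 0) != 1)
		goto out;

	/* Walk the control messages and pick out the credentials. */
	for (c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
		if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_CREDENTIALS) {
			memcpy(cred, CMSG_DATA(c), sizeof(*cred));
			ret = 0;
		}
	}
out:
	close(sk[0]);
	close(sk[1]);
	return ret;
}
```

Here sender and receiver are the same process, so the returned <code>cred.pid</code> is the caller's own PID.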

'''Links:'''
* [1] openvz-criu https://bitbucket.org/openvz/criu.ovz/history-node/918653a0a343194385592d7b50b5bd7a8fbe1cc1/criu/sk-unix.c?at=hci-dev
* [2] openvz-criu https://bitbucket.org/openvz/criu.ovz/history-node/918653a0a343194385592d7b50b5bd7a8fbe1cc1/criu/sk-queue.c?at=hci-dev
* [3] Linux kernel https://github.com/torvalds/linux/commit/5e2ff6704a275be009be8979af17c52361b79b89
* [4] Linux kernel https://github.com/torvalds/linux/commit/c679d17d3f2d895b34e660673141ad250889831f

'''Details:'''
* Skill level: intermediate / advanced
* Language: C
* Expected size: 350 hours
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
* Mentors: Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Integrate with Live Update Orchestrator (LUO) ===

'''Summary:''' Integrate with Live Update Orchestrator (LUO)

Live Update Orchestrator (LUO) is a framework for Linux kernel
live updates (via kexec). The idea behind it is to provide kernel
and user space APIs to save specific system resources across
a kexec reboot.

This research project explores how CRIU can be integrated with LUO.
For example, if a user is running memcached on a node, the current
approach would require a full CRIU dump that saves the entire
process memory to disk, followed by a restore after the
kernel live update.

Instead, CRIU could be extended to leverage the LUO API. When instructed,
it could preserve selected memory regions directly across the kexec reboot,
avoiding a full disk dump and significantly accelerating the restore process
after the kernel update.

'''Links:'''
* [1] LUO kernel documentation https://docs.kernel.org/core-api/liveupdate.html
* [2] LUO memfd doc https://docs.kernel.org/mm/memfd_preservation.html

'''Details:'''
* Skill level: intermediate / advanced
* Language: C
* Expected size: 350 hours
* Suggested by: Andrei Vagin <avagin@gmail.com>
* Mentors: Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Optimize COW memory dumping ===

'''Summary:''' Optimize COW memory dumping

The Linux kernel memory management subsystem is highly optimized not only for performance, but also to minimize unnecessary memory consumption. A key example of this is how the kernel handles private VMAs when user space invokes the fork() system call.

Rather than duplicating the entire VMA tree along with all memory contents, the kernel creates optimized copies of inherited VMAs using the Copy-on-Write (COW) mechanism. When a process writes to a page within a COW-ed VMA, a write page fault occurs, and the kernel creates a private copy of that page before applying the modification. However, if the page is only read, no copying is performed.

This approach significantly improves fork() performance and can dramatically reduce memory usage in many workloads.
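The user-visible effect of COW can be seen with a small experiment, sketched below with a hypothetical function name. It maps one private anonymous page, forks, and lets the child overwrite the page; the parent still sees its original data. Note that this only demonstrates the isolation semantics. It does not prove that the page was physically shared before the write, which cannot be observed without extra privileges.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one private anonymous page, fork, and let the child overwrite it.
 * Returns the byte the parent sees afterwards, or -1 on error.
 */
int parent_view_after_child_write(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	unsigned char *page;
	int status, seen;
	pid_t pid;

	assert(psz > 0);
	page = mmap(NULL, psz, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
		return -1;
	page[0] = 'P';

	pid = fork();
	if (pid < 0) {
		munmap(page, psz);
		return -1;
	}
	if (pid == 0) {
		/* Write fault: the kernel gives the child its own copy. */
		page[0] = 'C';
		_exit(page[0] == 'C' ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) != pid ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
		munmap(page, psz);
		return -1;
	}

	seen = page[0]; /* the parent's page is untouched */
	munmap(page, psz);
	return seen;
}
```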

In CRIU, when dumping VMAs and their associated memory pages, this COW optimization is not currently taken into account during the dump phase. As a result, for COW-backed VMAs, CRIU may generate multiple copies of identical memory pages in the dump image.

During restore, however, CRIU explicitly handles this situation (see [1] and [2]) and attempts to reconstruct COW relationships inside the kernel. This step is critical: without it, a checkpoint/restore (C/R) cycle could lead to a substantial increase in memory consumption for the same process tree. For example, a workload that originally consumed 500 MiB could expand to 800 MiB after restore, which is clearly unacceptable.

This project aims to improve the dumping algorithm so that it avoids producing multiple unnecessary copies of identical pages belonging to COW-ed VMAs.

The project requires some understanding of Linux memory management internals and CRIU’s architecture. We strongly encourage GSoC contributors to study references [1] and [2] and experiment with the relevant code paths before applying. We are happy to answer questions and provide guidance along the way.

'''Links:'''
* [1] preparing COW VMAs https://github.com/checkpoint-restore/criu/blob/c180188db036f8ea4c08bfee28cbcdbdd52cdfc3/criu/mem.c#L878
* [2] private vma content restore cow case https://github.com/checkpoint-restore/criu/blob/c180188db036f8ea4c08bfee28cbcdbdd52cdfc3/criu/mem.c#L1219

'''Details:'''
* Skill level: intermediate / advanced
* Language: C
* Expected size: 350 hours
* Suggested by: Andrei Vagin <avagin@gmail.com>
* Mentors: Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>

== Suspended project ideas ==

Listed here are tasks that seem suitable for GSoC, but currently do not have anybody to mentor them.

=== Optimize logging engine ===

'''Summary:''' CRIU writes a lot of log messages while doing its job. Logging is done with the plain fprintf function. The logs are typically of no use, but ''if'' some operation fails, they are the only way to find out the reason for the failure.

At the same time, the printf family of functions is known to be slow: they need to scan the format string for % conversion specifiers and then convert the arguments into strings. Comparing criu dump with and without logs shows a notable time difference (15%-20%), so speeding up logging will help improve criu performance.

One possible solution is binary logging. The problem with binary logs is the amount of effort needed to convert the existing logs to binary form. Preferably, the switch to binary logging either keeps the existing log() calls intact or provides some automation to convert them.

The option of keeping log() calls intact could be implemented as a pre-compilation pass over the sources. In this pass each <code>log(fmt, ...)</code> call gets translated into a call to a binary log function that saves the <code>fmt</code> identifier and copies all the args ''as is'' into the log file. The binary log decode utility, required in this case, should then look up the fmt string by the ID stored in the log file and print the resulting message.
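A toy version of this scheme might look like the sketch below. Everything in it is an assumption made for illustration: the format table stands in for what the pre-compilation pass would generate, records have a fixed size, and arguments are limited to three <code>long</code> values. A real design would need to handle strings and other argument types.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BINLOG_MAX_ARGS 3

/* Format table that a pre-compilation pass would generate (hypothetical). */
static const char *const binlog_fmts[] = {
	"dumped %ld pages from pid %ld",
	"restored %ld fds",
};
#define BINLOG_NR_FMTS (sizeof(binlog_fmts) / sizeof(binlog_fmts[0]))

/* One fixed-size record: format ID plus raw argument values. */
struct binlog_rec {
	unsigned int fmt_id;
	long args[BINLOG_MAX_ARGS];
};

/*
 * Hot path: no format parsing, only a copy of the ID and the arguments.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t binlog_emit(void *buf, size_t len, unsigned int fmt_id,
		   const long *args, int nargs)
{
	struct binlog_rec rec = { .fmt_id = fmt_id };

	assert(nargs >= 0 && nargs <= BINLOG_MAX_ARGS);
	if (len < sizeof(rec))
		return 0;
	if (nargs > 0)
		memcpy(rec.args, args, nargs * sizeof(long));
	memcpy(buf, &rec, sizeof(rec));
	return sizeof(rec);
}

/*
 * Decoder: look up the format string by ID and render the message.
 * Returns the snprintf() result, or -1 on a short or unknown record.
 */
int binlog_decode(const void *buf, size_t len, char *out, size_t outlen)
{
	struct binlog_rec rec;

	if (len < sizeof(rec))
		return -1;
	memcpy(&rec, buf, sizeof(rec));
	if (rec.fmt_id >= BINLOG_NR_FMTS)
		return -1;
	/* Arguments beyond what the format consumes are ignored. */
	return snprintf(out, outlen, binlog_fmts[rec.fmt_id],
			rec.args[0], rec.args[1], rec.args[2]);
}
```

The expensive string formatting moves entirely into the decoder, which only runs when somebody actually needs to read the log.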

'''Links:'''
* [[Better logging]]

'''Details:'''
* Skill level: intermediate
* Language: C, though decoder/preprocessor can be in any language
* Expected size: 350 hours
* Suggested by: Andrei Vagin
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== IOUring support ===
The io_uring Asynchronous I/O (AIO) framework is a new Linux I/O interface, first introduced in upstream Linux kernel version 5.1 (March 2019). It provides a low-latency and feature-rich interface for applications that require AIO functionality.

'''Links:'''
* https://blogs.oracle.com/linux/an-introduction-to-the-io_uring-asynchronous-io-framework
* https://github.com/axboe/liburing

'''Details:'''
* Skill level: expert (+linux kernel)
* Expected size: 350 hours

=== Add support for SPFS ===

'''Summary:''' SPFS is a special filesystem that allows checkpoint and restore of such things as NFS and FUSE

NFS support is already implemented in Virtuozzo CRIU, and it would be very beneficial to port it to mainline CRIU. An important part of this is implementing the integration of the Stub-Proxy File System (SPFS) with LXC/yet_another_containers_environment.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/60
* https://github.com/checkpoint-restore/criu/issues/53
* https://github.com/skinsbursky/spfs
* https://patchwork.criu.org/series/137/

'''Details:'''
* Skill level: expert
* Language: C
* Mentor: Alexander Mikhalitsyn <alexander@mihalicyn.com>
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Anonymise image files ===

'''Summary:''' Teach [[CRIT]] to remove sensitive information from images

When reporting a bug, it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. We need to teach CRIT to "anonymise" images for publication.

List of data to shred:

* Memory contents. For the sake of investigation, all the memory contents can simply be removed. Just the sizes of the pages*.img files are enough.
* Paths to files. Here we should keep the relations of the paths to each other. The simplest way seems to be replacing file names with "random" (or sequential) strings, BUT (!) taking care to keep this mapping 1:1. Note that file paths may also sit in sk-unix.img.
* Registers.
* Process names. (But relations should be kept).
* Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
* Ghost files.
* Tarballs with tmpfs-s.
* IP addresses in sk-inet-s, ip tool dumps and net*.img.

'''Links:'''
* [[Anonymize image files]]
* https://github.com/checkpoint-restore/criu/issues/360
* [[CRIT]], [[Images]]

'''Details:'''
* Skill level: beginner
* Language: Python

=== Add support for checkpoint/restore of CORK-ed UDP socket ===

'''Summary:''' Support C/R of corked UDP socket

There is a UDP_CORK option for sockets. As the man page says:
<pre>
If this option is enabled, then all data output on this socket
is accumulated into a single datagram that is transmitted when
the option is disabled. This option should not be used in
code intended to be portable.
</pre>

Currently criu refuses to dump this case, so it's effectively a bug. Supporting
this will require extending the kernel API to allow criu to read back the write queue
of the socket (see [[TCP connection|how it's done]] for TCP sockets, for example). Then
the queue is written into the image and is restored into the socket (with the CORK
bit set too).
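The behaviour of the option quoted above can be observed with a short loopback experiment, sketched here with a hypothetical function name. Between the two <code>send()</code> calls and the moment the option is cleared, the data sits in the write queue of the sending socket. That queue is exactly the state criu currently cannot read back.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Send "hello" and "world" through a corked UDP socket on loopback and
 * copy the first datagram the receiver gets into out.
 * Returns the datagram size, or -1 on error or timeout.
 */
ssize_t corked_send_recv(char *out, size_t outlen)
{
	struct timeval tv = { .tv_sec = 1 };
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	int rx, tx, on = 1, off = 0;
	ssize_t ret = -1;

	assert(out != NULL && outlen >= 10);
	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;

	/* Bind the receiver to an ephemeral loopback port. */
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(rx, (struct sockaddr *)&addr, &alen) < 0 ||
	    connect(tx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	/* Do not block forever if nothing arrives. */
	if (setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
		goto out;

	if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &on, sizeof(on)) < 0)
		goto out;
	if (send(tx, "hello", 5, 0) != 5 || send(tx, "world", 5, 0) != 5)
		goto out;
	/* Both chunks are now held back in the sender's write queue. */
	if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &off, sizeof(off)) < 0)
		goto out;

	/* Uncorking pushed the accumulated data out as one datagram. */
	ret = recv(rx, out, outlen, 0);
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return ret;
}
```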

'''Notes:'''

We have already made a few (3) attempts at this problem:

* The UDP_REPAIR approach didn't succeed: https://lore.kernel.org/netdev/721a2e32-c930-ad6b-5055-631b502ed11b@gmail.com/, https://lore.kernel.org/netdev/?q=udp_repair
* The eBPF (CRIB) approach: the socket queue iterator was not merged: https://lore.kernel.org/netdev/AM6PR03MB5848EDA002E3D7EACA7C6BDA99A52@AM6PR03MB5848.eurprd03.prod.outlook.com/, and there are general objections to the CRIB approach: https://lore.kernel.org/bpf/CAHk-=wjLWFa3i6+Tab67gnNumTYipj_HuheXr2RCq4zn0tCTzA@mail.gmail.com/

We still have one idea we have not tried: since UDP allows packets to be lost on the way, on restore we could somehow mark the socket to drop all data queued before UNCORK. This way we would not need to restore the contents of the send queue of a CORK-ed UDP socket.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/409
* https://github.com/criupatchwork/criu/commit/a532312
* [[Sockets]], [[TCP connection]]
* [https://groups.google.com/forum/#!topic/comp.os.linux.networking/Uz8PYiTCZSg UDP cork explained]

'''Details:'''
* Skill level: intermediate (+linux kernel)
* Language: C
* Expected size: 350 hours
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>, Andrei Vagin <avagin@gmail.com>

[[Category:GSoC]]
[[Category:Development]]

Google Summer of Code Ideas

2026-03-02T13:20:03Z

Amikhalitsyn: /* Project ideas */ Add LUO integration

Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three month period.

This page contains project ideas for upcoming Google Summer of Code.

== Contact ==

First, make sure to go through the [[GSoC Students Recommendations]]. Once you build CRIU locally and C/R a simple process successfully, please contact the respective mentor for the idea you are interested in. For general questions feel free to send an email to the [mailto:criu@lists.linux.dev mailing list] or write in [https://gitter.im/save-restore/criu gitter].

== Project ideas ==

=== Kubernetes Operator for Automated Checkpointing ===

'''Summary:''' Extend the Checkpoint/Restore Operator with support for automated policy-based checkpointing.

The [https://github.com/checkpoint-restore/checkpoint-restore-operator Checkpoint/Restore Operator] for Kubernetes currently supports only policies and parameters that limit the number of checkpoints. This project aims to extend the current support with automated policy-based checkpointing, allowing users to define triggers for checkpoint creation, such as time-based schedules, resource thresholds (CPU, memory, I/O usage), Kubernetes events (node drain, pod eviction, preemption), and application-level signals or annotations.

'''Links:'''
* https://github.com/checkpoint-restore/checkpoint-restore-operator
* https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api

'''Details:'''
* Skill level: intermediate
* Language: Go
* Expected size: 350 hours
* Mentors: Viktória Spišaková <spisakova@ics.muni.cz>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>

=== Forensic Checkpointing Framework for Kubernetes ===

Kubernetes provides a highly dynamic and ephemeral environment where workloads can start and disappear very quickly and are continuously being rescheduled across different nodes in the cluster.
One of the key challenges with forensic investigations in Kubernetes is capturing and preserving the evidence during security incidents. This project aims to address this problem by developing a framework for efficiently capturing and preserving the state of all running applications in a container at a specific point in time, along with the associated container configurations and metadata. These artifacts would allow investigators to accurately reconstruct the events, create a timeline, and analyze security incidents without impacting the running cluster. This is an important step towards enabling forensic readiness for Kubernetes, where cluster administrators proactively ensure the environments are prepared to collect and preserve evidence before a security incident occurs.

'''Links:'''
* https://github.com/checkpoint-restore/checkpointctl
* [https://fosdem.org/2026/events/attachments/F9RANH-forensic-snapshots-in-kubernetes/slides/267371/fosdem_2_4dh73ni.pdf Investigating Security Incidents with Forensic Snapshots in Kubernetes]
* [https://www.cncf.io/reports/cloud-native-security-whitepaper/ Cloud Native Security Whitepaper]
* [https://media.defense.gov/2022/Aug/29/2003066362/-1/-1/0/CTR_KUBERNETES_HARDENING_GUIDANCE_1.2_20220829.PDF Kubernetes Hardening Guide]

'''Details:'''
* Skill level: intermediate
* Language: Go
* Expected size: 350 hours
* Mentors: Lorena Goldoni <lory.goldoni@gmail.com>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>

=== Enabling Checkpoint/Restore of Rootless Containers ===

[https://rootlesscontaine.rs/ Rootless containers] are containers that can be created, run, and managed by unprivileged users. Container engines such as Podman natively support running containers in a rootless mode to improve security and usability. While checkpoint/restore functionality is already available for rootful containers and unprivileged checkpointing is possible with the <code>CAP_CHECKPOINT_RESTORE</code> capability, container engines do not yet support native checkpointing of containers running in rootless mode. This project aims to explore and address the remaining challenges required to enable unprivileged checkpoint/restore for rootless containers.

'''Links:'''
* https://github.com/checkpoint-restore/criu/pull/1930
* https://github.com/torvalds/linux/commit/124ea650d3072b005457faed69909221c2905a1f
* https://src.fedoraproject.org/rpms/criu/pull-request/10#request_diff

'''Details:'''
* Skill level: intermediate
* Language: C, Go
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>

=== Files on detached mounts ===

'''Summary:''' Initial support of open files on "detached" mounts

When criu dumps a process with an open fd on a file, it gets the mount identifier (mnt_id) via /proc/<pid>/fdinfo/<fd>, so that criu knows from which exact mount the file was initially opened. This way criu can restore this fd by opening the same exact file from topologically the same mount in restored mount tree.

Restoring fd from the right mount can be important in different cases, for instance if the process would later want to resolve paths relative to the fd, and obviously resolving from the same file on different mount can lead to different resolved paths, or if the process wants to check path to the file via /proc/<pid>/fd/<fd>.

But we have a problem finding on which mount we need to reopen the file at restore if we only know mnt_id but can't find this mnt_id in /proc/<pid>/mountinfo.

Mountinfo file shows the mount tree topology of current mntns: parent - child relations, sharing group information, mountpoint and fs root information. And if we don't see mnt_id in it we don't know anything about this mount.

This can happen in two cases

* 1) external mount or file - if file was opened from e.g. host it's mount would not be visible in container mountinfo
* 2) mount was lazily unmounted

In case of 1) we have criu options to help criu handle external dependencies.

In case of 2) or no options provided criu can't resolve mnt_id in mountinfo and criu fails.

'''Solution:'''
We can handle 2) with: resolving major/minor via fstat, using name_to_handle_at and open_by_handle_at to open same file on any other available mount from same superblock (same major/minor) in container. Now we have fd2 of the same file as fd, but on existing mount we can dump it as usual instead, and mark it as "detached" in image, now criu on restore knows where to find this file, but instead of just opening fd2 from actually restored mount, we create a temporary bindmount which is lazy unmounted just after open making the file appear as a file on detached mount.

Known problems with this approach:

* Stat on btrfs gives wrong major/minor
* file handles does not work everywhere
* file handles can return fd2 on deleted file or on other hardlink, this needs special handling.

Additionally (optional part):
We can export real major/minor in fdinfo (kernel).
We can think of new kernel interface to get mount's major/minor and root (shift from fsroot) for detached mounts, if we have it we don't need file handle hack to find file on other mount (see fsinfo or getvalues kernel patches in LKML, can we add this info there?).

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentor: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Checkpointing of POSIX message queues ===

'''Summary:''' Add support for checkpoint/restore of POSIX message queues

POSIX message queues are a widely used inter-process communication mechanism. Message queues are implemented as files on a virtual filesystem (mqueue), where a file descriptor (message queue descriptor) is used to perform operations such as sending or receiving messages. To support checkpoint/restore of POSIX message queues, we need a kernel interface (similar to [https://github.com/checkpoint-restore/criu/commit/8ce9e947051e43430eb2ff06b96dddeba467b4fd MSG_PEEK]) that would enable the retrieval of messages from a queue without removing them. This project aims to implement such an interface that allows retrieving all messages and their priorities from a POSIX message queue.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/2285
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/ipc/mqueue.c
* https://www.man7.org/tlpi/download/TLPI-52-POSIX_Message_Queues.pdf

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Add support for SCM_CREDENTIALS / SCM_PIDFD and friends ===

'''Summary:''' Support for SCM_CREDENTIALS / SCM_PIDFD

SCM_CREDENTIALS and SCM_PIDFD are types of SCM (Socket-level Control Messages). They play a crucial role
in systemd and many other user space applications. This project is about adding support for these
SCMs to be properly saved and restored back with CRIU. There is an existing code in OpenVZ CRIU fork,
see [1] and [2]. Goal would be first of all to properly port this code, cover with extensive tests and
ensure that SCM_PIDFD / SO_PEERPIDFD are handled correctly. Also we expect to cover things like
SO_PASSRIGHTS and SO_PASSPIDFD.

There is some extra source of complexity here pidfds can be "stale" (see PIDFD_STALE in Linux kernel)
and we need to ensure that we properly cover those cases.

'''Links:'''
* [1] openvz-criu https://bitbucket.org/openvz/criu.ovz/history-node/918653a0a343194385592d7b50b5bd7a8fbe1cc1/criu/sk-unix.c?at=hci-dev
* [2] openvz-criu https://bitbucket.org/openvz/criu.ovz/history-node/918653a0a343194385592d7b50b5bd7a8fbe1cc1/criu/sk-queue.c?at=hci-dev
* [3] Linux kernel https://github.com/torvalds/linux/commit/5e2ff6704a275be009be8979af17c52361b79b89
* [4] Linux kernel https://github.com/torvalds/linux/commit/c679d17d3f2d895b34e660673141ad250889831f

'''Details:'''
* Skill level: intermediate / advanced
* Language: C
* Expected size: 350 hours
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
* Mentors: Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Integrate with Live Update Orchestrator (LUO) ===

'''Summary:''' Integrate with Live Update Orchestrator (LUO)

Live Update Orchestrator (LUO) is a framework for Linux kernel
live updates (via kexec). Idea behind it is to provide kernel
and user space API to save specific system resources across
kexec reboot.

This research project explores how CRIU can be integrated with LUO.
For example, if a user is running memcached on a node, the current
approach would require a full CRIU dump, then saving the entire
process memory to disk, then followed by restoring it after the
kernel live update.

Instead, CRIU could be extended to leverage the LUO API. When instructed,
it could preserve selected memory regions directly across the kexec reboot,
avoiding a full disk dump and significantly accelerating the restore process
after the kernel update.

'''Links:'''
* [1] LUO kernel documentation https://docs.kernel.org/core-api/liveupdate.html
* [2] LUO memfd doc https://docs.kernel.org/mm/memfd_preservation.html

'''Details:'''
* Skill level: intermediate / advanced
* Language: C
* Expected size: 350 hours
* Suggested by: Andrei Vagin <avagin@gmail.com>
* Mentors: Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>

== Suspended project ideas ==

Listed here are tasks that seem suitable for GSoC but currently do not have anybody to mentor them.

=== Optimize logging engine ===

'''Summary:''' CRIU writes a lot of logs while doing its job. Logging is done with the plain fprintf function. The logs are typically useless, but ''if'' some operation fails, they are the only way to find the reason for the failure.

At the same time, the printf family of functions is known to be slow: they need to scan the format string for % conversion specifiers and then convert the arguments into strings. Comparing criu dump with and without logs shows a notable time difference (15%-20%), so speeding up logging will help improve criu performance.

One solution to the problem might be binary logging. The problem with binary logs is the amount of effort needed to convert existing logs to binary form. Preferably, the switch to binary logging either keeps existing log() calls intact or provides some automation to convert them.

The option to keep log() calls intact might be a pre-compilation pass over the sources. In this pass each <code>log(fmt, ...)</code> call gets translated into a call to a binary log function that saves the <code>fmt</code> identifier and copies all the args ''as is'' into the log file. The binary log decode utility, required in this case, should then find the fmt string by its ID in the log file and print the resulting message.
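The record layout described above can be sketched as follows. This is only an illustration under stated assumptions: the names <code>blog_write</code>, <code>blog_decode</code>, <code>struct blog_hdr</code> and the format table are hypothetical and not part of CRIU, every argument is assumed to be a 64-bit unsigned integer, and the table that a pre-compilation pass would generate is written by hand here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table; a pre-compilation pass would generate it from log() calls. */
static const char *const fmt_table[] = {
	"Dumping task %llu\n",
	"Mapped %llu pages at 0x%llx\n",
};
#define FMT_COUNT (sizeof(fmt_table) / sizeof(fmt_table[0]))
#define BLOG_MAX_ARGS 2

/* On-disk record: this header followed by nargs raw 64-bit values. */
struct blog_hdr {
	uint32_t fmt_id;
	uint32_t nargs;
};

/* Store one record in buf; returns the number of bytes used, 0 on error. */
size_t blog_write(unsigned char *buf, size_t size, uint32_t fmt_id,
		  uint32_t nargs, const unsigned long long *args)
{
	struct blog_hdr hdr = { fmt_id, nargs };
	size_t need = sizeof(hdr) + nargs * sizeof(args[0]);

	assert(buf && (args || nargs == 0));
	if (fmt_id >= FMT_COUNT || nargs > BLOG_MAX_ARGS || need > size)
		return 0;
	/* No string formatting on the hot path: just two copies. */
	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + sizeof(hdr), args, nargs * sizeof(args[0]));
	return need;
}

/* Decoder side: look up fmt by ID and format the message into out. */
int blog_decode(const unsigned char *buf, size_t len, char *out, size_t out_size)
{
	unsigned long long a[BLOG_MAX_ARGS] = { 0 };
	struct blog_hdr hdr;

	if (len < sizeof(hdr))
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));
	if (hdr.fmt_id >= FMT_COUNT || hdr.nargs > BLOG_MAX_ARGS ||
	    len < sizeof(hdr) + hdr.nargs * sizeof(a[0]))
		return -1;
	memcpy(a, buf + sizeof(hdr), hdr.nargs * sizeof(a[0]));
	/* Arguments beyond what the format consumes are ignored by snprintf. */
	return snprintf(out, out_size, fmt_table[hdr.fmt_id], a[0], a[1]);
}
```

A real implementation would also have to deal with string and pointer arguments, which cannot be copied ''as is'' this simply.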

'''Links:'''
* [[Better logging]]

'''Details:'''
* Skill level: intermediate
* Language: C, though decoder/preprocessor can be in any language
* Expected size: 350 hours
* Suggested by: Andrei Vagin
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== IOUring support ===
The io_uring Asynchronous I/O (AIO) framework is a new Linux I/O interface, first introduced in upstream Linux kernel version 5.1 (March 2019). It provides a low-latency and feature-rich interface for applications that require AIO functionality.

'''Links:'''
* https://blogs.oracle.com/linux/an-introduction-to-the-io_uring-asynchronous-io-framework
* https://github.com/axboe/liburing

'''Details:'''
* Skill level: expert (+linux kernel)
* Expected size: 350 hours

=== Add support for SPFS ===

'''Summary:''' The SPFS is a special filesystem that allows checkpoint and restore of such things as NFS and FUSE

NFS support is already implemented in Virtuozzo CRIU, but it would be very beneficial to port it to mainline CRIU. The important part of it is the need to implement the integration of Stub-Proxy File System (SPFS) with LXC/yet_another_containers_environment.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/60
* https://github.com/checkpoint-restore/criu/issues/53
* https://github.com/skinsbursky/spfs
* https://patchwork.criu.org/series/137/

'''Details:'''
* Skill level: expert
* Language: C
* Mentor: Alexander Mikhalitsyn <alexander@mihalicyn.com>
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Anonymise image files ===

'''Summary:''' Teach [[CRIT]] to remove sensitive information from images

When reporting a bug, it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. We need to teach CRIT to "anonymise" images for publication.

List of data to shred:

* Memory contents. For the sake of investigation, all the memory contents can simply be removed. The sizes of the pages*.img files alone are enough.
* Paths to files. Here we should keep the relations of the paths to each other. The simplest way seems to be replacing file names with "random" (or sequential) strings, BUT (!) making sure that this mapping is 1:1. Note that file paths may also sit in sk-unix.img.
* Registers.
* Process names. (But relations should be kept).
* Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
* Ghost files.
* Tarballs with tmpfs-s.
* IP addresses in sk-inet-s, ip tool dumps and net*.img.
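
The 1:1 path mapping from the list above could look roughly like the sketch below. It is written in C only for illustration (CRIT itself is Python), and the names <code>anon_map</code>, <code>anon_id</code> and <code>anon_path</code> are hypothetical. Each path component is replaced with a sequential name, and the same component always gets the same replacement, so relations between paths survive.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ANON_MAX 64        /* illustrative limit on distinct components */
#define ANON_NAME_MAX 255  /* longest component accepted */

struct anon_map {
	char orig[ANON_MAX][ANON_NAME_MAX + 1];
	size_t count;
};

/* Return the stable ID of a component, allocating a new one if needed. */
int anon_id(struct anon_map *m, const char *name)
{
	size_t i;

	for (i = 0; i < m->count; i++)
		if (strcmp(m->orig[i], name) == 0)
			return (int)i;
	if (m->count == ANON_MAX || strlen(name) > ANON_NAME_MAX)
		return -1;
	strcpy(m->orig[m->count], name);
	return (int)m->count++;
}

/* Rewrite path into out, replacing every component with "n<ID>". */
int anon_path(struct anon_map *m, const char *path, char *out, size_t size)
{
	char comp[ANON_NAME_MAX + 1];
	const char *p = path;
	size_t used = 0, len;
	int id, n;

	assert(m && path && out);
	if (size == 0)
		return -1;
	out[0] = '\0';
	while (*p) {
		if (*p == '/') {
			/* Separators are copied, so the tree shape is kept. */
			if (used + 1 >= size)
				return -1;
			out[used++] = '/';
			out[used] = '\0';
			p++;
			continue;
		}
		len = strcspn(p, "/");
		if (len > ANON_NAME_MAX)
			return -1;
		memcpy(comp, p, len);
		comp[len] = '\0';
		id = anon_id(m, comp);
		if (id < 0)
			return -1;
		n = snprintf(out + used, size - used, "n%d", id);
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
		p += len;
	}
	return 0;
}
```

With one shared map, <code>/etc/ssl/key.pem</code> and <code>/etc/ssl/cert.pem</code> keep their common prefix after anonymisation, and the same map can be reused for paths found in sk-unix.img.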

'''Links:'''
* [[Anonymize image files]]
* https://github.com/checkpoint-restore/criu/issues/360
* [[CRIT]], [[Images]]

'''Details:'''
* Skill level: beginner
* Language: Python

=== Add support for checkpoint/restore of CORK-ed UDP socket ===

'''Summary:''' Support C/R of corked UDP socket

There is a UDP_CORK option for sockets. As the man page says:
<pre>
If this option is enabled, then all data output on this socket
is accumulated into a single datagram that is transmitted when
the option is disabled. This option should not be used in
code intended to be portable.
</pre>

Currently criu refuses to dump this case, so it's effectively a bug. Supporting
this will require extending the kernel API to allow criu to read back the write queue
of the socket (see [[TCP connection|how it's done]] for TCP sockets, for example). The
queue is then written into the image and restored into the socket (with the CORK
bit set too).
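
To make the state in question concrete, here is a minimal sketch of a corked UDP socket on loopback. The function name <code>cork_demo</code> is hypothetical and the code is not taken from CRIU; it only uses the standard socket API. Between the two <code>setsockopt()</code> calls the data sits in the socket's send queue, and that queue is what criu currently cannot read back.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Send two chunks through a corked UDP socket and return the size of the
 * single datagram that arrives after uncorking, or -1 on error.
 */
ssize_t cork_demo(char *out, size_t out_size)
{
	struct sockaddr_in addr = { .sin_family = AF_INET };
	struct timeval tv = { .tv_sec = 2 };
	socklen_t alen = sizeof(addr);
	int rx, tx, on = 1, off = 0;
	ssize_t ret = -1;

	assert(out && out_size >= 6);
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;
	/* Port 0 lets the kernel pick a free port; read it back afterwards. */
	if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) ||
	    getsockname(rx, (struct sockaddr *)&addr, &alen) ||
	    connect(tx, (struct sockaddr *)&addr, sizeof(addr)) ||
	    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)))
		goto out;

	if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &on, sizeof(on)))
		goto out;
	if (send(tx, "foo", 3, 0) != 3 || send(tx, "bar", 3, 0) != 3)
		goto out;

	/* Nothing has been transmitted yet: both chunks are still queued. */
	if (recv(rx, out, out_size, MSG_DONTWAIT) >= 0 ||
	    (errno != EAGAIN && errno != EWOULDBLOCK))
		goto out;

	/* Uncorking flushes the queue as one datagram. */
	if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &off, sizeof(off)))
		goto out;
	ret = recv(rx, out, out_size, 0);
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return ret;
}
```

If a dump happened right after the second <code>send()</code>, the six queued bytes would have to be either saved and restored or deliberately dropped.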

'''Notes:'''

We have already made a few attempts at this problem:

* UDP_REPAIR approach didn't succeed: https://lore.kernel.org/netdev/721a2e32-c930-ad6b-5055-631b502ed11b@gmail.com/, https://lore.kernel.org/netdev/?q=udp_repair
* eBPF (CRIB) approach, socket queue iterator was not merged: https://lore.kernel.org/netdev/AM6PR03MB5848EDA002E3D7EACA7C6BDA99A52@AM6PR03MB5848.eurprd03.prod.outlook.com/, and we have general objections to CRIB approach https://lore.kernel.org/bpf/CAHk-=wjLWFa3i6+Tab67gnNumTYipj_HuheXr2RCq4zn0tCTzA@mail.gmail.com/

We still have one idea we didn't try: since UDP allows packets to be lost on the way, on restore we could somehow mark the socket to drop all data before UNCORK. This way we would not really need to restore the contents of the send queue of a CORK-ed UDP socket.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/409
* https://github.com/criupatchwork/criu/commit/a532312
* [[Sockets]], [[TCP connection]]
* [https://groups.google.com/forum/#!topic/comp.os.linux.networking/Uz8PYiTCZSg UDP cork explained]

'''Details:'''
* Skill level: intermediate (+linux kernel)
* Language: C
* Expected size: 350 hours
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>, Andrei Vagin <avagin@gmail.com>

[[Category:GSoC]]
[[Category:Development]]

Google Summer of Code Ideas

2026-02-09T22:17:09Z

Amikhalitsyn: /* Project ideas */ add SCM_CREDENTIALS / SCM_PIDFD project idea

Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three month period.

This page contains project ideas for upcoming Google Summer of Code.

== Contact ==

First, make sure to go through the [[GSoC Students Recommendations]]. Once you build CRIU locally and C/R a simple process successfully, please contact the respective mentor for the idea you are interested in. For general questions feel free to send an email to the [mailto:criu@lists.linux.dev mailing list] or write in [https://gitter.im/save-restore/criu gitter].

== Project ideas ==

=== Kubernetes Operator for Automated Checkpointing ===

'''Summary:''' Extend the Checkpoint/Restore Operator with support for automated policy-based checkpointing.

The [https://github.com/checkpoint-restore/checkpoint-restore-operator Checkpoint/Restore Operator] for Kubernetes currently supports only policies and parameters that limit the number of checkpoints. This project aims to extend the current support with automated policy-based checkpointing, allowing users to define triggers for checkpoint creation, such as time-based schedules, resource thresholds (CPU, memory, I/O usage), Kubernetes events (node drain, pod eviction, preemption), and application-level signals or annotations.

'''Links:'''
* https://github.com/checkpoint-restore/checkpoint-restore-operator
* https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api

'''Details:'''
* Skill level: intermediate
* Language: Go
* Expected size: 350 hours
* Mentors: Viktória Spišaková <spisakova@ics.muni.cz>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>

=== Forensic Checkpointing Framework for Kubernetes ===

Kubernetes provides a highly dynamic and ephemeral environment where workloads can start and disappear very quickly and are continuously being rescheduled across different nodes in the cluster.
One of the key challenges with forensic investigations in Kubernetes is capturing and preserving the evidence during security incidents. This project aims to address this problem by developing a framework for efficiently capturing and preserving the state of all running applications in a container at a specific point in time, along with the associated container configurations and metadata. These artifacts would allow investigators to accurately reconstruct the events, create a timeline, and analyze security incidents without impacting the running cluster. This is an important step towards enabling forensic readiness for Kubernetes, where cluster administrators proactively ensure the environments are prepared to collect and preserve evidence before a security incident occurs.

'''Links:'''
* https://github.com/checkpoint-restore/checkpointctl
* [https://fosdem.org/2026/events/attachments/F9RANH-forensic-snapshots-in-kubernetes/slides/266249/fosdem_2_4dh73ni.pdf Investigating Security Incidents with Forensic Snapshots in Kubernetes]
* [https://www.cncf.io/reports/cloud-native-security-whitepaper/ Cloud Native Security Whitepaper]
* [https://media.defense.gov/2022/Aug/29/2003066362/-1/-1/0/CTR_KUBERNETES_HARDENING_GUIDANCE_1.2_20220829.PDF Kubernetes Hardening Guide]

'''Details:'''
* Skill level: intermediate
* Language: Go
* Expected size: 350 hours
* Mentors: Lorena Goldoni <lory.goldoni@gmail.com>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>

=== Enabling Checkpoint/Restore of Rootless Containers ===

[https://rootlesscontaine.rs/ Rootless containers] are containers that can be created, run, and managed by unprivileged users. Container engines such as Podman natively support running containers in a rootless mode to improve security and usability. While checkpoint/restore functionality is already available for rootful containers and unprivileged checkpointing is possible with the <code>CAP_CHECKPOINT_RESTORE</code> capability, container engines do not yet support native checkpointing of containers running in rootless mode. This project aims to explore and address the remaining challenges required to enable unprivileged checkpoint/restore for rootless containers.

'''Links:'''
* https://github.com/checkpoint-restore/criu/pull/1930
* https://github.com/torvalds/linux/commit/124ea650d3072b005457faed69909221c2905a1f
* https://src.fedoraproject.org/rpms/criu/pull-request/10#request_diff

'''Details:'''
* Skill level: intermediate
* Language: C, Go
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Adrian Reber <areber@redhat.com>

=== Add support for memory compression ===

'''Summary:''' Support compression for page images

We would like to support compression of memory page files
in CRIU using one of the fastest algorithms (which one to
choose is a matter of discussion!).

This task does not require any Linux kernel modifications,
and its scope is limited to CRIU itself. At the same time, it is
complex enough, as we need to touch the memory dump/restore code path
in CRIU and also handle many corner cases such as the page server.

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Suggested by: Andrei Vagin <avagin@gmail.com>
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Alexander Mikhalitsyn <alexander@mihalicyn.com>, Andrei Vagin <avagin@gmail.com>

=== Files on detached mounts ===

'''Summary:''' Initial support of open files on "detached" mounts

When criu dumps a process with an open fd on a file, it gets the mount identifier (mnt_id) via /proc/<pid>/fdinfo/<fd>, so that criu knows from which exact mount the file was initially opened. This way criu can restore this fd by opening the same exact file from topologically the same mount in restored mount tree.

Restoring fd from the right mount can be important in different cases, for instance if the process would later want to resolve paths relative to the fd, and obviously resolving from the same file on different mount can lead to different resolved paths, or if the process wants to check path to the file via /proc/<pid>/fd/<fd>.

But we have a problem finding on which mount we need to reopen the file at restore if we only know mnt_id but can't find this mnt_id in /proc/<pid>/mountinfo.

Mountinfo file shows the mount tree topology of current mntns: parent - child relations, sharing group information, mountpoint and fs root information. And if we don't see mnt_id in it we don't know anything about this mount.

This can happen in two cases:

* 1) external mount or file - if the file was opened from e.g. the host, its mount would not be visible in the container's mountinfo
* 2) mount was lazily unmounted

In case 1) we have criu options to help criu handle external dependencies.

In case 2), or when no options are provided, criu can't resolve mnt_id in mountinfo and fails.
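
The lookup that fails in these cases can be sketched as follows. This is a simplified illustration, not CRIU's actual code: the helper names <code>fd_mnt_id</code> and <code>mnt_id_visible</code> are hypothetical, and the sketch inspects only the calling process (<code>/proc/self</code>).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the mnt_id reported in /proc/self/fdinfo/<fd>, or -1 on error. */
int fd_mnt_id(int fd)
{
	char path[64], line[256];
	int mnt_id = -1;
	FILE *f;

	assert(fd >= 0);
	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
	f = fopen(path, "r");
	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		if (sscanf(line, "mnt_id: %d", &mnt_id) == 1)
			break;
	fclose(f);
	return mnt_id;
}

/* Return 1 if mnt_id is listed in /proc/self/mountinfo, 0 if not, -1 on error. */
int mnt_id_visible(int mnt_id)
{
	char line[4096];
	int id, found = 0, at_start = 1;
	FILE *f = fopen("/proc/self/mountinfo", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		/* The first field of every mountinfo line is the mount ID. */
		if (at_start && sscanf(line, "%d", &id) == 1 && id == mnt_id) {
			found = 1;
			break;
		}
		/* Very long lines arrive in chunks; only the first one counts. */
		at_start = strchr(line, '\n') != NULL;
	}
	fclose(f);
	return found;
}
```

For a file on a lazily unmounted mount, <code>fd_mnt_id()</code> still returns an ID, but <code>mnt_id_visible()</code> returns 0 for it, which is exactly the situation this project has to handle.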

'''Solution:'''
We can handle 2) by resolving major/minor via fstat and then using name_to_handle_at and open_by_handle_at to open the same file on any other available mount of the same superblock (same major/minor) in the container. This gives us fd2, which refers to the same file as fd but on an existing mount, so we can dump fd2 as usual and mark it as "detached" in the image. On restore criu then knows where to find the file, but instead of just opening it from the actually restored mount, it creates a temporary bind mount, opens the file there, and lazily unmounts the bind mount right after the open, making the file appear as a file on a detached mount.

Known problems with this approach:

* Stat on btrfs gives wrong major/minor
* file handles do not work everywhere
* file handles can return fd2 on a deleted file or on another hardlink; this needs special handling.

Additionally (optional part):
We can export the real major/minor in fdinfo (kernel).
We can think of a new kernel interface to get a mount's major/minor and root (shift from fsroot) for detached mounts. If we had it, we would not need the file handle hack to find the file on another mount (see the fsinfo or getvalues kernel patches on LKML; can we add this info there?).

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentor: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Checkpointing of POSIX message queues ===

'''Summary:''' Add support for checkpoint/restore of POSIX message queues

POSIX message queues are a widely used inter-process communication mechanism. Message queues are implemented as files on a virtual filesystem (mqueue), where a file descriptor (message queue descriptor) is used to perform operations such as sending or receiving messages. To support checkpoint/restore of POSIX message queues, we need a kernel interface (similar to [https://github.com/checkpoint-restore/criu/commit/8ce9e947051e43430eb2ff06b96dddeba467b4fd MSG_PEEK]) that would enable the retrieval of messages from a queue without removing them. This project aims to implement such an interface that allows retrieving all messages and their priorities from a POSIX message queue.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/2285
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/ipc/mqueue.c
* https://www.man7.org/tlpi/download/TLPI-52-POSIX_Message_Queues.pdf

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Add support for SCM_CREDENTIALS / SCM_PIDFD and friends ===

'''Summary:''' Support for SCM_CREDENTIALS / SCM_PIDFD

SCM_CREDENTIALS and SCM_PIDFD are types of SCM (Socket-level Control Messages). They play a crucial role
in systemd and many other user space applications. This project is about adding support for saving and
restoring these SCMs properly with CRIU. There is existing code in the OpenVZ CRIU fork,
see [1] and [2]. The first goal is to port this code properly, cover it with extensive tests, and
ensure that SCM_PIDFD / SO_PEERPIDFD are handled correctly. We also expect to cover things like
SO_PASSRIGHTS and SO_PASSPIDFD.

There is an extra source of complexity here: pidfds can be "stale" (see PIDFD_STALE in the Linux kernel),
and we need to ensure that we properly cover those cases.
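
For readers new to these control messages, the sketch below shows how SCM_CREDENTIALS reaches a receiver. It covers only SCM_CREDENTIALS (not SCM_PIDFD), the function names <code>recv_creds</code> and <code>creds_roundtrip</code> are hypothetical, and it is not the OpenVZ code referenced in [1] and [2]. A message like this, sitting unread in a socket queue at dump time, is what CRIU would need to save and recreate.

```c
#define _GNU_SOURCE /* for struct ucred and SCM_CREDENTIALS */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Receive one message on sk and copy out its SCM_CREDENTIALS, if present. */
int recv_creds(int sk, struct ucred *creds)
{
	union {
		char buf[CMSG_SPACE(sizeof(struct ucred))];
		struct cmsghdr align; /* forces correct alignment of buf */
	} ctl;
	char data;
	struct iovec iov = { .iov_base = &data, .iov_len = sizeof(data) };
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctl.buf,
		.msg_controllen = sizeof(ctl.buf),
	};
	struct cmsghdr *c;

	assert(creds);
	if (recvmsg(sk, &msg, 0) < 0)
		return -1;
	for (c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
		if (c->cmsg_level == SOL_SOCKET &&
		    c->cmsg_type == SCM_CREDENTIALS) {
			memcpy(creds, CMSG_DATA(c), sizeof(*creds));
			return 0;
		}
	}
	return -1;
}

/* Send one byte over a socketpair and fetch the attached credentials. */
int creds_roundtrip(struct ucred *creds)
{
	int sk[2], on = 1, ret = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sk))
		return -1;
	/* SO_PASSCRED on the receiver makes the kernel attach sender creds. */
	if (setsockopt(sk[1], SOL_SOCKET, SO_PASSCRED, &on, sizeof(on)) == 0 &&
	    send(sk[0], "x", 1, 0) == 1)
		ret = recv_creds(sk[1], creds);
	close(sk[0]);
	close(sk[1]);
	return ret;
}
```

The pid, uid and gid in the received <code>struct ucred</code> describe the sender at send time, so a restored message has to carry the same values even though the restorer is a different process.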

'''Links:'''
* [1] openvz-criu https://bitbucket.org/openvz/criu.ovz/history-node/918653a0a343194385592d7b50b5bd7a8fbe1cc1/criu/sk-unix.c?at=hci-dev
* [2] openvz-criu https://bitbucket.org/openvz/criu.ovz/history-node/918653a0a343194385592d7b50b5bd7a8fbe1cc1/criu/sk-queue.c?at=hci-dev
* [3] Linux kernel https://github.com/torvalds/linux/commit/5e2ff6704a275be009be8979af17c52361b79b89
* [4] Linux kernel https://github.com/torvalds/linux/commit/c679d17d3f2d895b34e660673141ad250889831f

'''Details:'''
* Skill level: intermediate / advanced
* Language: C
* Expected size: 350 hours
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
* Mentors: Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>

== Suspended project ideas ==

Listed here are tasks that seem suitable for GSoC but currently do not have anybody to mentor them.

=== Optimize logging engine ===

'''Summary:''' CRIU writes a lot of logs while doing its job. Logging is done with the plain fprintf function. The logs are typically useless, but ''if'' some operation fails, they are the only way to find the reason for the failure.

At the same time, the printf family of functions is known to be slow: they need to scan the format string for % conversion specifiers and then convert the arguments into strings. Comparing criu dump with and without logs shows a notable time difference (15%-20%), so speeding up logging will help improve criu performance.

One solution to the problem might be binary logging. The problem with binary logs is the amount of effort needed to convert existing logs to binary form. Preferably, the switch to binary logging either keeps existing log() calls intact or provides some automation to convert them.

The option to keep log() calls intact might be a pre-compilation pass over the sources. In this pass each <code>log(fmt, ...)</code> call gets translated into a call to a binary log function that saves the <code>fmt</code> identifier and copies all the args ''as is'' into the log file. The binary log decode utility, required in this case, should then find the fmt string by its ID in the log file and print the resulting message.

'''Links:'''
* [[Better logging]]

'''Details:'''
* Skill level: intermediate
* Language: C, though decoder/preprocessor can be in any language
* Expected size: 350 hours
* Suggested by: Andrei Vagin
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== IOUring support ===
The io_uring Asynchronous I/O (AIO) framework is a new Linux I/O interface, first introduced in upstream Linux kernel version 5.1 (March 2019). It provides a low-latency and feature-rich interface for applications that require AIO functionality.

'''Links:'''
* https://blogs.oracle.com/linux/an-introduction-to-the-io_uring-asynchronous-io-framework
* https://github.com/axboe/liburing

'''Details:'''
* Skill level: expert (+linux kernel)
* Expected size: 350 hours

=== Add support for SPFS ===

'''Summary:''' The SPFS is a special filesystem that allows checkpoint and restore of such things as NFS and FUSE

NFS support is already implemented in Virtuozzo CRIU, but it would be very beneficial to port it to mainline CRIU. The important part of it is the need to implement the integration of Stub-Proxy File System (SPFS) with LXC/yet_another_containers_environment.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/60
* https://github.com/checkpoint-restore/criu/issues/53
* https://github.com/skinsbursky/spfs
* https://patchwork.criu.org/series/137/

'''Details:'''
* Skill level: expert
* Language: C
* Mentor: Alexander Mikhalitsyn <alexander@mihalicyn.com>
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Anonymise image files ===

'''Summary:''' Teach [[CRIT]] to remove sensitive information from images

When reporting a bug, it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. We need to teach CRIT to "anonymise" images for publication.

List of data to shred:

* Memory contents. For the sake of investigation, all the memory contents can simply be removed. The sizes of the pages*.img files alone are enough.
* Paths to files. Here we should keep the relations of the paths to each other. The simplest way seems to be replacing file names with "random" (or sequential) strings, BUT (!) making sure that this mapping is 1:1. Note that file paths may also sit in sk-unix.img.
* Registers.
* Process names. (But relations should be kept).
* Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
* Ghost files.
* Tarballs with tmpfs-s.
* IP addresses in sk-inet-s, ip tool dumps and net*.img.

'''Links:'''
* [[Anonymize image files]]
* https://github.com/checkpoint-restore/criu/issues/360
* [[CRIT]], [[Images]]

'''Details:'''
* Skill level: beginner
* Language: Python

=== Add support for checkpoint/restore of CORK-ed UDP socket ===

'''Summary:''' Support C/R of corked UDP socket

There is a UDP_CORK option for sockets. As the man page says:
<pre>
If this option is enabled, then all data output on this socket
is accumulated into a single datagram that is transmitted when
the option is disabled. This option should not be used in
code intended to be portable.
</pre>

Currently criu refuses to dump this case, so it's effectively a bug. Supporting
this will require extending the kernel API to allow criu to read back the write queue
of the socket (see [[TCP connection|how it's done]] for TCP sockets, for example). The
queue is then written into the image and restored into the socket (with the CORK
bit set too).

'''Notes:'''

We have already made a few attempts at this problem:

* UDP_REPAIR approach didn't succeed: https://lore.kernel.org/netdev/721a2e32-c930-ad6b-5055-631b502ed11b@gmail.com/, https://lore.kernel.org/netdev/?q=udp_repair
* eBPF (CRIB) approach, socket queue iterator was not merged: https://lore.kernel.org/netdev/AM6PR03MB5848EDA002E3D7EACA7C6BDA99A52@AM6PR03MB5848.eurprd03.prod.outlook.com/, and we have general objections to CRIB approach https://lore.kernel.org/bpf/CAHk-=wjLWFa3i6+Tab67gnNumTYipj_HuheXr2RCq4zn0tCTzA@mail.gmail.com/

We still have one idea we didn't try: since UDP allows packets to be lost on the way, on restore we could somehow mark the socket to drop all data before UNCORK. This way we would not really need to restore the contents of the send queue of a CORK-ed UDP socket.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/409
* https://github.com/criupatchwork/criu/commit/a532312
* [[Sockets]], [[TCP connection]]
* [https://groups.google.com/forum/#!topic/comp.os.linux.networking/Uz8PYiTCZSg UDP cork explained]

'''Details:'''
* Skill level: intermediate (+linux kernel)
* Language: C
* Expected size: 350 hours
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>, Andrei Vagin <avagin@gmail.com>

[[Category:GSoC]]
[[Category:Development]]

Arm64-GCS

2025-07-14T17:58:43Z

Amikhalitsyn:

TBD

[[Category:Under the hood]]

Arm64-GCS

2025-07-14T17:55:45Z

Amikhalitsyn: Created page with "TBD"

TBD

Google Summer of Code Ideas

2025-02-14T13:51:03Z

Amikhalitsyn: /* Project ideas */ add arm64 Guarded Control Stack (GCS) project

Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three month period.

This page contains project ideas for upcoming Google Summer of Code.

== Contacts ==

Please contact the respective mentor for the idea you are interested in. For general questions feel free to send an email to the [mailto:criu@lists.linux.dev mailing list] or write in [https://gitter.im/save-restore/criu gitter].

== Project ideas ==

=== Add support for memory compression ===

'''Summary:''' Support compression for page images

We would like to support compression of memory page files
in CRIU using one of the fastest algorithms (which one to
choose is a matter of discussion!).

This task does not require any Linux kernel modifications,
and its scope is limited to CRIU itself. At the same time, it is
complex enough, as we need to touch the memory dump/restore code path
in CRIU and also handle many corner cases such as the page server.

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Suggested by: Andrei Vagin <avagin@gmail.com>
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Andrei Vagin <avagin@gmail.com>

=== Use eBPF to lock and unlock the network ===

'''Summary:''' Use eBPF instead of external iptables-restore tool for network lock and unlock.

During checkpointing and restoring CRIU locks the network to make sure no network packets are accepted by the network stack during the time the process is checkpointed. Currently CRIU calls out to iptables-restore to create and delete the corresponding iptables rules. Another approach which avoids calling out to the external binary iptables-restore would be to directly inject eBPF rules. There have been reports from users that iptables-restore fails in some way and eBPF could avoid this external dependency.

'''Links:'''
* https://www.criu.org/TCP_connection#Checkpoint_and_restore_TCP_connection
* https://github.com/systemd/systemd/blob/master/src/core/bpf-firewall.c
* https://blog.zeyady.com/2021-08-16/gsoc-criu

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Prajwal S N <prajwalnadig21@gmail.com>
* Suggested by: Adrian Reber <areber@redhat.com>

=== Files on detached mounts ===

'''Summary:''' Initial support of open files on "detached" mounts

When criu dumps a process with an open fd on a file, it gets the mount identifier (mnt_id) via /proc/<pid>/fdinfo/<fd>, so that criu knows from which exact mount the file was initially opened. This way criu can restore this fd by opening the same exact file from topologically the same mount in restored mount tree.

Restoring fd from the right mount can be important in different cases, for instance if the process would later want to resolve paths relative to the fd, and obviously resolving from the same file on different mount can lead to different resolved paths, or if the process wants to check path to the file via /proc/<pid>/fd/<fd>.

But we have a problem finding on which mount we need to reopen the file at restore if we only know mnt_id but can't find this mnt_id in /proc/<pid>/mountinfo.

Mountinfo file shows the mount tree topology of current mntns: parent - child relations, sharing group information, mountpoint and fs root information. And if we don't see mnt_id in it we don't know anything about this mount.

This can happen in two cases:

* 1) external mount or file - if the file was opened from e.g. the host, its mount would not be visible in the container's mountinfo
* 2) mount was lazily unmounted

In case 1) we have criu options to help criu handle external dependencies.

In case 2), or when no options are provided, criu can't resolve mnt_id in mountinfo and fails.

'''Solution:'''
We can handle 2) by resolving major/minor via fstat and then using name_to_handle_at and open_by_handle_at to open the same file on any other available mount of the same superblock (same major/minor) in the container. This gives us fd2, which refers to the same file as fd but on an existing mount, so we can dump fd2 as usual and mark it as "detached" in the image. On restore criu then knows where to find the file, but instead of just opening it from the actually restored mount, it creates a temporary bind mount, opens the file there, and lazily unmounts the bind mount right after the open, making the file appear as a file on a detached mount.

Known problems with this approach:

* Stat on btrfs gives wrong major/minor
* file handles do not work everywhere
* file handles can return fd2 on a deleted file or on another hardlink; this needs special handling.

Additionally (optional part):
We can export the real major/minor in fdinfo (kernel).
We can think of a new kernel interface to get a mount's major/minor and root (shift from fsroot) for detached mounts. If we had it, we would not need the file handle hack to find the file on another mount (see the fsinfo or getvalues kernel patches on LKML; can we add this info there?).

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentor: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Checkpointing of POSIX message queues ===

'''Summary:''' Add support for checkpoint/restore of POSIX message queues

POSIX message queues are a widely used inter-process communication mechanism. Message queues are implemented as files on a virtual filesystem (mqueue), where a file descriptor (message queue descriptor) is used to perform operations such as sending or receiving messages. To support checkpoint/restore of POSIX message queues, we need a kernel interface (similar to [https://github.com/checkpoint-restore/criu/commit/8ce9e947051e43430eb2ff06b96dddeba467b4fd MSG_PEEK]) that would enable the retrieval of messages from a queue without removing them. This project aims to implement such an interface that allows retrieving all messages and their priorities from a POSIX message queue.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/2285
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/ipc/mqueue.c
* https://www.man7.org/tlpi/download/TLPI-52-POSIX_Message_Queues.pdf

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>, Prajwal S N <prajwalnadig21@gmail.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Kubernetes operator for managing container checkpoints ===

'''Summary:''' Develop a Kubernetes operator that automates the management of container checkpoints

Container checkpointing has recently been introduced as an alpha feature in Kubernetes.
To enable this feature, the kubelet API was extended with an endpoint that enables the
creation of checkpoints for individual containers. By default, all container checkpoints
are stored as tar archives in <code>/var/lib/kubelet/checkpoints</code> using the following
file name format: <code>checkpoint-<pod-name>_<namespace-name>-<container-name>-<timestamp>.tar</code>.
However, the current implementation does not provide a mechanism for limiting the number
of checkpoints, which may lead to filling up all existing disk space. This project aims to
develop a Kubernetes operator that automates the management of checkpoints and provides
a garbage collection mechanism to discard obsolete checkpoints.

'''Links:'''
* https://github.com/checkpoint-restore/checkpoint-restore-operator
* https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/
* https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/
* https://kubernetes.io/blog/2023/03/10/forensic-container-analysis/
* https://github.com/kubernetes/kubernetes/pull/115888
* https://github.com/kubernetes/enhancements/issues/2008

'''Details:'''
* Skill level: intermediate
* Language: Go
* Expected size: 350 hours
* Mentors: Adrian Reber <areber@redhat.com>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Prajwal S N <prajwalnadig21@gmail.com>
* Suggested by: Adrian Reber

=== Add support for arm64 Guarded Control Stack (GCS) ===

'''Summary:''' Support arm64 Guarded Control Stack (GCS)

The arm64 Guarded Control Stack (GCS) feature provides support for
hardware protected stacks of return addresses, intended to provide
hardening against return oriented programming (ROP) attacks and to make
it easier to gather call stacks for applications such as profiling (taken from [1]).
We would like to support arm64 Guarded Control Stack (GCS) in CRIU, which means
that CRIU should be able to Checkpoint/Restore applications using GCS.

This task should not require any Linux kernel modifications
but will require a lot of effort to understand Linux kernel and
glibc support patches. We have a good example of support for
x86 shadow stack [4] thanks to Mike.

'''Links:'''
* [1] kernel support https://lore.kernel.org/all/20241001-arm64-gcs-v13-0-222b78d87eee@kernel.org
* [2] libc support https://inbox.sourceware.org/libc-alpha/20250117174119.3254972-1-yury.khrustalev@arm.com
* [3] libc tests https://inbox.sourceware.org/libc-alpha/20250210114538.1723249-1-yury.khrustalev@arm.com
* [4] x86 support (a great reference!) https://github.com/checkpoint-restore/criu/pull/2306

'''Details:'''
* Skill level: expert (a lot of moving parts: Linux kernel / libc / CRIU)
* Language: C
* Expected size: 350 hours
* Suggested by: Mike Rapoport <rppt@kernel.org>
* Mentors: Mike Rapoport <rppt@kernel.org>, Andrei Vagin <avagin@gmail.com>, Alexander Mikhalitsyn <alexander@mihalicyn.com>

== Suspended project ideas ==

Listed here are tasks that seem suitable for GSoC but currently have nobody to mentor them.

=== Optimize logging engine ===

'''Summary:''' CRIU writes a lot of logs while doing its job. Logging is done with a plain fprintf function. The logs are typically not needed, but ''if'' some operation fails, they are the only way to find the reason for the failure.

At the same time, the printf family of functions is known to be slow: it has to scan the format string for % conversion specifiers and then convert the arguments into strings. Comparing criu dump with and without logs shows a notable time difference (15%-20%), so speeding up logging will improve criu performance.

One solution might be binary logging. The problem with binary logs is the amount of effort needed to convert existing logs to binary form. Preferably, the switch to binary logging either keeps existing log() calls intact or provides some automation to convert them.

Keeping log() calls intact could be achieved with a pre-compilation pass over the sources. In this pass each <code>log(fmt, ...)</code> call gets translated into a call to a binary log function that saves the <code>fmt</code> identifier and copies all the args ''as is'' into the log file. The binary log decode utility, required in this case, should then find the fmt string by the ID stored in the log file and print the resulting message.
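
A minimal sketch of the binary logging idea, written in Python for brevity rather than in C. The record layout (16-bit format ID, 8-bit argument count, 64-bit integer arguments), the <code>FORMATS</code> table, and the function names <code>binlog_write</code> and <code>binlog_decode</code> are assumptions made for illustration and do not describe any existing CRIU code. Only integer arguments are handled.

```python
import struct

# Hypothetical table that a pre-compilation pass could generate:
# every distinct format string gets a small integer ID (its index).
FORMATS = [
    "Dumping task %d with %d threads",
    "Restored %d pages",
]
FMT_ID = {fmt: i for i, fmt in enumerate(FORMATS)}


def binlog_write(buf, fmt, *args):
    """Append one record: format ID, argument count, raw 64-bit integer args.

    No string formatting happens here, which is the point of the approach.
    """
    buf.extend(struct.pack("<HB", FMT_ID[fmt], len(args)))
    for arg in args:
        buf.extend(struct.pack("<q", arg))


def binlog_decode(data):
    """Decoder side: look up the format string by ID and format it only now."""
    messages = []
    offset = 0
    while offset < len(data):
        fmt_id, nargs = struct.unpack_from("<HB", data, offset)
        offset += 3  # size of the "<HB" header
        args = struct.unpack_from("<%dq" % nargs, data, offset)
        offset += 8 * nargs
        messages.append(FORMATS[fmt_id] % args)
    return messages


log = bytearray()
binlog_write(log, "Dumping task %d with %d threads", 1234, 4)
binlog_write(log, "Restored %d pages", 512)
decoded = binlog_decode(bytes(log))
```

The writer only copies fixed-size values, and the cost of interpreting the format string moves to the decoder, which runs only when somebody actually needs to read the log.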

'''Links:'''
* [[Better logging]]

'''Details:'''
* Skill level: intermediate
* Language: C, though decoder/preprocessor can be in any language
* Expected size: 350 hours
* Suggested by: Andrei Vagin
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== IOUring support ===
The io_uring Asynchronous I/O (AIO) framework is a new Linux I/O interface, first introduced in upstream Linux kernel version 5.1 (March 2019). It provides a low-latency and feature-rich interface for applications that require AIO functionality.

'''Links:'''
* https://blogs.oracle.com/linux/an-introduction-to-the-io_uring-asynchronous-io-framework
* https://github.com/axboe/liburing

'''Details:'''
* Skill level: expert (+linux kernel)
* Expected size: 350 hours

=== Add support for SPFS ===

'''Summary:''' The SPFS is a special filesystem that allows checkpoint and restore of such things as NFS and FUSE

NFS support is already implemented in Virtuozzo CRIU, and it would be very beneficial to port it to mainline CRIU. An important part of this work is integrating the Stub-Proxy File System (SPFS) with LXC/yet_another_containers_environment.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/60
* https://github.com/checkpoint-restore/criu/issues/53
* https://github.com/skinsbursky/spfs
* https://patchwork.criu.org/series/137/

'''Details:'''
* Skill level: expert
* Language: C
* Mentor: Alexander Mikhalitsyn <alexander@mihalicyn.com>
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Anonymise image files ===

'''Summary:''' Teach [[CRIT]] to remove sensitive information from images

When reporting a bug, it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. We need to teach CRIT to "anonymise" images for publication.

List of data to shred:

* Memory contents. For the sake of investigation, all the memory contents can simply be removed. The sizes of the pages*.img files are enough.
* Paths to files. Here we should keep the relations between paths. The simplest way seems to be replacing file names with "random" (or sequential) strings, BUT (!) taking care to keep this mapping 1:1. Note that file paths may also sit in sk-unix.img.
* Registers.
* Process names. (But relations should be kept).
* Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
* Ghost files.
* Tarballs with tmpfs-s.
* IP addresses in sk-inet-s, ip tool dumps and net*.img.
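
The 1:1 path mapping from the list above could look roughly like this. It is a sketch only: the class name <code>PathAnonymizer</code> and the token scheme (<code>n1</code>, <code>n2</code>, ...) are invented for illustration and are not part of CRIT. Each path component is replaced by a sequential token, and the same component always gets the same token, so paths that shared a directory before still share it afterwards.

```python
import itertools


class PathAnonymizer:
    """Replace every path component with a sequential token, keeping the mapping 1:1."""

    def __init__(self):
        self._names = {}
        self._counter = itertools.count(1)

    def _token(self, component):
        # Keep structural components so the shape of the path survives.
        if component in ("", ".", ".."):
            return component
        if component not in self._names:
            self._names[component] = "n%d" % next(self._counter)
        return self._names[component]

    def anonymize(self, path):
        return "/".join(self._token(c) for c in path.split("/"))


anon = PathAnonymizer()
p1 = anon.anonymize("/home/alice/secret.txt")
p2 = anon.anonymize("/home/alice/notes/todo")
p3 = anon.anonymize("/home/alice/secret.txt")
```

Because one anonymizer instance holds the whole mapping, the same instance would have to be used for every image that contains paths (including sk-unix.img) to keep them consistent with each other.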

'''Links:'''
* [[Anonymize image files]]
* https://github.com/checkpoint-restore/criu/issues/360
* [[CRIT]], [[Images]]

'''Details:'''
* Skill level: beginner
* Language: Python

=== Add support for checkpoint/restore of CORK-ed UDP socket ===

'''Summary:''' Support C/R of corked UDP socket

There is a UDP_CORK option for sockets. As the man page says:
<pre>
If this option is enabled, then all data output on this socket
is accumulated into a single datagram that is transmitted when
the option is disabled. This option should not be used in
code intended to be portable.
</pre>

Currently criu refuses to dump such sockets, so it's effectively a bug. Supporting
this will require extending the kernel API to allow criu to read back the write queue
of the socket (see [[TCP connection|how it's done]] for TCP sockets, for example). The
queue is then written into the image and restored into the socket (with the CORK
bit set too).

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/409
* https://github.com/criupatchwork/criu/commit/a532312
* [[Sockets]], [[TCP connection]]
* [https://groups.google.com/forum/#!topic/comp.os.linux.networking/Uz8PYiTCZSg UDP cork explained]

'''Details:'''
* Skill level: intermediate (+linux kernel)
* Language: C
* Expected size: 350 hours
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>, Andrei Vagin <avagin@gmail.com>

[[Category:GSoC]]
[[Category:Development]]

Google Summer of Code Ideas

2025-02-14T12:35:47Z

Amikhalitsyn: /* Suspended project ideas */

Google Summer of Code Ideas

2025-02-14T12:35:35Z

Amikhalitsyn: /* Project ideas */

Main Page

2025-02-14T12:21:36Z

Amikhalitsyn: update CRIU mailing list

<div style="float: {{{1|right}}}">
{{Download box|left}}
[[Image:4.0.c.jpg|right|340px]]
</div>
__NOTOC__
<big>Welcome to CRIU, a project to implement checkpoint/restore functionality for Linux.

Checkpoint/Restore In Userspace, or CRIU (pronounced kree-oo, IPA: /krɪʊ/, Russian: криу), is software for Linux. It can freeze a running container (or an individual application) and checkpoint its state to disk. The saved data can be used to restore the application and run it exactly as it was at the time of the freeze. Using this functionality, application or container live migration, snapshots, remote debugging, and [[usage scenarios|many other things]] are now possible.

CRIU started as a project of Virtuozzo, and grew with the tremendous help from the [[community]]. It is currently used by (integrated into) OpenVZ, [[LXC]]/LXD, [[Docker]], [[Podman]], and [[Integration|other software]], and [[packages|packaged for many Linux distributions]].
</big>
{{Like}}
<br clear="both">

<div class="m_right">
{{News block 2}}
</div>

<div class="m_left">
== Using ==

<big>
;Getting [[packages]] for your distribution
: Or try manual [[installation]] to have CRIU on your system
</big>

;[[CLI]], [[RPC]] and [[C API]]
: Three ways to start using the C/R functionality. [[:Category:API|More info]] about APIs.

;[[Usage scenarios]]
: Ideas how criu can be used (some are crazy indeed)

;[[:Category:HOWTO]]
: Collection of real world examples of how to use CRIU. Some are complex, some are not. HOW TO dump a [[simple loop]] might be the best one to start with. Also a set of [[asciinema]] records for real-life examples.

;[https://www.criu.org/index.php?title=FAQ FAQ] & [[When C/R fails]]
: A sort of troubleshooting guide

;[[What can change after C/R]]
: CRIU cannot (yet) save and restore every single bit of a task's state. This page describes which bits visible through the standard kernel API may change.

;[[What cannot be checkpointed]]
: What an application could do to make CRIU refuse to dump it.

;[[Contacts]]
: Ways to communicate with CRIU community

</div>

<div class="m_center">
== Developing ==
If you're interested in CRIU development, please subscribe to the criu mailing list: https://lore.kernel.org/criu/ (old one is https://lists.openvz.org/mailman/listinfo/criu)

;[[Images]]
: Description of image files format

;[[Plugins]]
: CRIU can call plugins provided by people

;[[Upstream kernel commits]]
: Mainline kernel commits tracker

;[[Recent commits]]
: CRIU tool repository commits

;[[Manpages]]
: Kernel's manpages commits tracker

;[[ZDTM Test Suite]]
: Zero downtime test suite

;[[Todo|TODO]]
: Current TODO list

;[[User namespace]]
: Implementing user namespace support

;[[Postulates]]
: What to keep in mind when writing new code

;[https://coveralls.io/github/checkpoint-restore/criu Code coverage results]
: Shows how zdtm run covers the criu code paths

;[[How to submit patches]]
:

</div>

<br clear="both">
<div class="m_left">
== Under the hood ==
* [[Checkpoint/Restore]]
* [[:Category:Under the hood]]
* [[:Category:Network]]
* [[:Category:Files]]
* [[:Category:Memory]]
* [[Pending signals]]
* [[Stages of restoring]]
* [[Code blobs]]
* [[Comparison to other CR projects]]
</div>

<div class="m_center">
== External links ==
{{:Articles}}
</div>

<div class="m_right">
== Misc ==
* [[Academic Research]]
* [[Podcasts]] and other audio/video interviews
* Project [[history]]
* [[Logo]] description
* [[Events]]
* [[CRIU acronym fun]]
</div>

Google Summer of Code Ideas

2025-02-14T12:19:22Z

Amikhalitsyn: update CRIU mailing list

Google Summer of Code Ideas

2024-03-12T17:45:50Z

Amikhalitsyn: /* Project ideas */

Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three month period.

This page contains project ideas for upcoming Google Summer of Code.

== Contacts ==

Please contact the respective mentor for the idea you are interested in. For general questions feel free to send an email to the [mailto:criu@openvz.org mailing list] or write in [https://gitter.im/save-restore/criu gitter].

== Project ideas ==

=== Add support for memory compression ===

'''Summary:''' Support compression for page images

We would like to support compression of memory page files
in CRIU using one of the fastest algorithms (which one to
choose is a matter of discussion!).

This task does not require any Linux kernel modifications,
and its scope is limited to CRIU itself. At the same time it's
complex enough, as we need to touch the memory dump/restore code path
in CRIU and also handle many corner cases such as the page server.

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Suggested by: Andrei Vagin <avagin@gmail.com>
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Andrei Vagin <avagin@gmail.com>

=== Add support for checkpoint/restore of CORK-ed UDP socket ===

'''Summary:''' Support C/R of corked UDP socket

There is a UDP_CORK option for sockets. As the man page says:
<pre>
If this option is enabled, then all data output on this socket
is accumulated into a single datagram that is transmitted when
the option is disabled. This option should not be used in
code intended to be portable.
</pre>

Currently criu refuses to dump such sockets, so it's effectively a bug. Supporting
this will require extending the kernel API to allow criu to read back the write queue
of the socket (see [[TCP connection|how it's done]] for TCP sockets, for example). The
queue is then written into the image and restored into the socket (with the CORK
bit set too).

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/409
* https://github.com/criupatchwork/criu/commit/a532312
* [[Sockets]], [[TCP connection]]
* [https://groups.google.com/forum/#!topic/comp.os.linux.networking/Uz8PYiTCZSg UDP cork explained]

'''Details:'''
* Skill level: intermediate (+linux kernel)
* Language: C
* Expected size: 350 hours
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Andrei Vagin <avagin@gmail.com>

=== Add support for pidfd file descriptors ===

'''Summary:''' Support C/R of pidfd descriptors

The pidfd_open syscall allows opening
a special PID file descriptor. Using it, a user can send a signal to
the process (pidfd_send_signal syscall) or wait for the process
(poll() on the pidfd).

At the moment CRIU can't dump processes that have open pidfds.

'''Links:'''
* https://lwn.net/Articles/801319/
* https://lwn.net/Articles/794707/
* https://github.com/torvalds/linux/blob/v5.16/kernel/fork.c#L1877

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Christian Brauner <christian@brauner.io>
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Use eBPF to lock and unlock the network ===

'''Summary:''' Use eBPF instead of external iptables-restore tool for network lock and unlock.

During checkpointing and restoring CRIU locks the network to make sure no network packets are accepted by the network stack during the time the process is checkpointed. Currently CRIU calls out to iptables-restore to create and delete the corresponding iptables rules. Another approach which avoids calling out to the external binary iptables-restore would be to directly inject eBPF rules. There have been reports from users that iptables-restore fails in some way and eBPF could avoid this external dependency.

'''Links:'''
* https://www.criu.org/TCP_connection#Checkpoint_and_restore_TCP_connection
* https://github.com/systemd/systemd/blob/master/src/core/bpf-firewall.c
* https://blog.zeyady.com/2021-08-16/gsoc-criu

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Prajwal S N <prajwalnadig21@gmail.com>
* Suggested by: Adrian Reber <areber@redhat.com>

=== Files on detached mounts ===

'''Summary:''' Initial support of open files on "detached" mounts

When criu dumps a process with an open fd on a file, it gets the mount identifier (mnt_id) via /proc/<pid>/fdinfo/<fd>, so that criu knows from which exact mount the file was initially opened. This way criu can restore the fd by opening the very same file from the topologically same mount in the restored mount tree.

Restoring the fd from the right mount can be important in different cases: for instance, if the process later wants to resolve paths relative to the fd (resolving from the same file on a different mount can lead to different resolved paths), or if the process wants to check the path to the file via /proc/<pid>/fd/<fd>.

But we have a problem finding the mount on which to reopen the file at restore if we only know the mnt_id but can't find it in /proc/<pid>/mountinfo.

The mountinfo file shows the mount tree topology of the current mntns: parent-child relations, sharing group information, mountpoint and fs root information. If we don't see the mnt_id in it, we don't know anything about this mount.

This can happen in two cases:

* 1) external mount or file: if the file was opened from e.g. the host, its mount would not be visible in the container mountinfo
* 2) the mount was lazily unmounted

In case 1) we have criu options to help criu handle external dependencies.

In case 2), or if no options are provided, criu can't resolve the mnt_id in mountinfo and fails.
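
The failing lookup can be sketched as follows. This is an illustration in Python, not CRIU code, and the function names <code>fd_mnt_id</code>, <code>mountinfo_ids</code> and <code>is_unresolvable</code> are hypothetical. The sketch works on the text of the two proc files so that it stays self-contained, and the sample contents below are made up.

```python
def fd_mnt_id(fdinfo_text):
    """Extract mnt_id from the contents of /proc/<pid>/fdinfo/<fd>."""
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "mnt_id":
            return int(value)
    return None


def mountinfo_ids(mountinfo_text):
    """Collect mount IDs (the first field) from /proc/<pid>/mountinfo contents."""
    return {
        int(line.split()[0])
        for line in mountinfo_text.splitlines()
        if line.strip()
    }


def is_unresolvable(fdinfo_text, mountinfo_text):
    """True if the fd's mount is not visible in the mount namespace."""
    mnt_id = fd_mnt_id(fdinfo_text)
    return mnt_id is not None and mnt_id not in mountinfo_ids(mountinfo_text)


# Made-up sample data in the format of the two proc files.
MOUNTINFO = (
    "29 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw\n"
    "30 29 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw\n"
)
FDINFO_VISIBLE = "pos:\t0\nflags:\t02100000\nmnt_id:\t29\n"
FDINFO_DETACHED = "pos:\t0\nflags:\t02100000\nmnt_id:\t457\n"
```

On a live system the two strings would come from reading <code>/proc/<pid>/fdinfo/<fd></code> and <code>/proc/<pid>/mountinfo</code>. A positive result covers both cases listed above; telling an external mount from a lazily unmounted one needs additional information.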

'''Solution:'''
We can handle 2) by resolving major/minor via fstat and using name_to_handle_at and open_by_handle_at to open the same file on any other available mount from the same superblock (same major/minor) in the container. Now we have fd2 of the same file as fd, but on an existing mount, so we can dump it as usual instead and mark it as "detached" in the image. On restore criu then knows where to find this file, but instead of just opening fd2 from the actually restored mount, we create a temporary bind mount which is lazily unmounted right after the open, making the file appear as a file on a detached mount.

Known problems with this approach:

* Stat on btrfs gives wrong major/minor
* File handles do not work everywhere
* File handles can return fd2 on a deleted file or on another hardlink; this needs special handling.

Additionally (optional part):
We can export the real major/minor in fdinfo (kernel).
We can think of a new kernel interface to get a mount's major/minor and root (shift from fsroot) for detached mounts. If we have it, we don't need the file handle hack to find the file on another mount (see the fsinfo or getvalues kernel patches on LKML; can we add this info there?).

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentor: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Checkpointing of POSIX message queues ===

'''Summary:''' Add support for checkpoint/restore of POSIX message queues

POSIX message queues are a widely used inter-process communication mechanism. Message queues are implemented as files on a virtual filesystem (mqueue), where a file descriptor (message queue descriptor) is used to perform operations such as sending or receiving messages. To support checkpoint/restore of POSIX message queues, we need a kernel interface (similar to [https://github.com/checkpoint-restore/criu/commit/8ce9e947051e43430eb2ff06b96dddeba467b4fd MSG_PEEK]) that would enable the retrieval of messages from a queue without removing them. This project aims to implement such an interface that allows retrieving all messages and their priorities from a POSIX message queue.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/2285
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/ipc/mqueue.c
* https://www.man7.org/tlpi/download/TLPI-52-POSIX_Message_Queues.pdf

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>, Prajwal S N <prajwalnadig21@gmail.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Kubernetes operator for managing container checkpoints ===

'''Summary:''' Develop a Kubernetes operator that automates the management of container checkpoints

Container checkpointing has recently been introduced as an alpha feature in Kubernetes.
To enable this feature, the kubelet API was extended with an endpoint that enables the
creation of checkpoints for individual containers. By default, all container checkpoints
are stored as tar archives in <code>/var/lib/kubelet/checkpoints</code> using the following
file name format: <code>checkpoint-<pod-name>_<namespace-name>-<container-name>-<timestamp>.tar</code>.
However, the current implementation does not provide a mechanism for limiting the number
of checkpoints, which may lead to filling up all existing disk space. This project aims to
develop a Kubernetes operator that automates the management of checkpoints and provides
a garbage collection mechanism to discard obsolete checkpoints.

'''Links:'''
* https://github.com/checkpoint-restore/checkpoint-restore-operator
* https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/
* https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/
* https://kubernetes.io/blog/2023/03/10/forensic-container-analysis/
* https://github.com/kubernetes/kubernetes/pull/115888
* https://github.com/kubernetes/enhancements/issues/2008

'''Details:'''
* Skill level: intermediate
* Language: Go
* Expected size: 350 hours
* Mentors: Adrian Reber <areber@redhat.com>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Prajwal S N <prajwalnadig21@gmail.com>
* Suggested by: Adrian Reber

== Suspended project ideas ==

Listed here are tasks that seem suitable for GSoC but currently have nobody to mentor them.

=== Optimize logging engine ===

'''Summary:''' CRIU writes a lot of logs while doing its job. Logging is done with a plain fprintf function. The logs are typically not needed, but ''if'' some operation fails, they are the only way to find the reason for the failure.

At the same time, the printf family of functions is known to be slow: it has to scan the format string for % conversion specifiers and then convert the arguments into strings. Comparing criu dump with and without logs shows a notable time difference (15%-20%), so speeding up logging will improve criu performance.

One solution might be binary logging. The problem with binary logs is the amount of effort needed to convert existing logs to binary form. Preferably, the switch to binary logging either keeps existing log() calls intact or provides some automation to convert them.

Keeping log() calls intact could be achieved with a pre-compilation pass over the sources. In this pass each <code>log(fmt, ...)</code> call gets translated into a call to a binary log function that saves the <code>fmt</code> identifier and copies all the args ''as is'' into the log file. The binary log decode utility, required in this case, should then find the fmt string by the ID stored in the log file and print the resulting message.

'''Links:'''
* [[Better logging]]

'''Details:'''
* Skill level: intermediate
* Language: C, though decoder/preprocessor can be in any language
* Expected size: 350 hours
* Suggested by: Andrei Vagin
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== IOUring support ===
The io_uring Asynchronous I/O (AIO) framework is a new Linux I/O interface, first introduced in upstream Linux kernel version 5.1 (March 2019). It provides a low-latency and feature-rich interface for applications that require AIO functionality.

'''Links:'''
* https://blogs.oracle.com/linux/an-introduction-to-the-io_uring-asynchronous-io-framework
* https://github.com/axboe/liburing

'''Details:'''
* Skill level: expert (+linux kernel)
* Expected size: 350 hours

=== Add support for SPFS ===

'''Summary:''' The SPFS is a special filesystem that allows checkpoint and restore of such things as NFS and FUSE

NFS support is already implemented in Virtuozzo CRIU, and it would be very beneficial to port it to mainline CRIU. An important part of this work is integrating the Stub-Proxy File System (SPFS) with LXC/yet_another_containers_environment.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/60
* https://github.com/checkpoint-restore/criu/issues/53
* https://github.com/skinsbursky/spfs
* https://patchwork.criu.org/series/137/

'''Details:'''
* Skill level: expert
* Language: C
* Mentor: Alexander Mikhalitsyn <alexander@mihalicyn.com>
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Anonymise image files ===

'''Summary:''' Teach [[CRIT]] to remove sensitive information from images

When reporting a bug, it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. We need to teach CRIT to "anonymise" images for publication.

List of data to shred:

* Memory contents. For the sake of investigation, all the memory contents can simply be removed. The sizes of the pages*.img files are enough.
* Paths to files. Here we should keep the relations between paths. The simplest way seems to be replacing file names with "random" (or sequential) strings, BUT (!) taking care to keep this mapping 1:1. Note that file paths may also sit in sk-unix.img.
* Registers.
* Process names. (But relations should be kept).
* Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
* Ghost files.
* Tarballs with tmpfs-s.
* IP addresses in sk-inet-s, ip tool dumps and net*.img.

'''Links:'''
* [[Anonymize image files]]
* https://github.com/checkpoint-restore/criu/issues/360
* [[CRIT]], [[Images]]

'''Details:'''
* Skill level: beginner
* Language: Python

[[Category:GSoC]]
[[Category:Development]]

Google Summer of Code Ideas

2024-03-12T16:33:03Z

Amikhalitsyn: /* Project ideas */

Google Summer of Code Ideas

2024-03-12T16:32:54Z

Amikhalitsyn: /* Suspended project ideas */ moved "Optimize logging engine" to suspended projects list

Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three month period.

This page contains project ideas for upcoming Google Summer of Code.

== Contacts ==

Please contact the respective mentor for the idea you are interested in. For general questions feel free to send an email to the [mailto:criu@openvz.org mailing list] or write in [https://gitter.im/save-restore/criu gitter].

== Project ideas ==

=== Optimize logging engine ===

'''Summary:''' CRIU writes a lot of logs while doing its job. Logging is done with a plain fprintf function. The logs are typically not needed, but ''if'' some operation fails, they are the only way to find the reason for the failure.

At the same time, the printf family of functions is known to be slow: it has to scan the format string for % conversion specifiers and then convert the arguments into strings. Comparing criu dump with and without logs shows a notable time difference (15%-20%), so speeding up logging will improve criu performance.

One solution might be binary logging. The problem with binary logs is the amount of effort needed to convert existing logs to binary form. Preferably, the switch to binary logging either keeps existing log() calls intact or provides some automation to convert them.

Keeping log() calls intact could be achieved with a pre-compilation pass over the sources. In this pass each <code>log(fmt, ...)</code> call gets translated into a call to a binary log function that saves the <code>fmt</code> identifier and copies all the args ''as is'' into the log file. The binary log decode utility, required in this case, should then find the fmt string by the ID stored in the log file and print the resulting message.

'''Links:'''
* [[Better logging]]

'''Details:'''
* Skill level: intermediate
* Language: C, though decoder/preprocessor can be in any language
* Expected size: 350 hours
* Suggested by: Andrei Vagin
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Add support for checkpoint/restore of CORK-ed UDP socket ===

'''Summary:''' Support C/R of corked UDP socket

There is a UDP_CORK option for sockets. As the man page says:
<pre>
If this option is enabled, then all data output on this socket
is accumulated into a single datagram that is transmitted when
the option is disabled. This option should not be used in
code intended to be portable.
</pre>

Currently criu refuses to dump such sockets, so it's effectively a bug. Supporting
this will require extending the kernel API to allow criu to read back the write queue
of the socket (see [[TCP connection|how it's done]] for TCP sockets, for example). The
queue is then written into the image and restored into the socket (with the CORK
bit set too).

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/409
* https://github.com/criupatchwork/criu/commit/a532312
* [[Sockets]], [[TCP connection]]
* [https://groups.google.com/forum/#!topic/comp.os.linux.networking/Uz8PYiTCZSg UDP cork explained]

'''Details:'''
* Skill level: intermediate (+linux kernel)
* Language: C
* Expected size: 350 hours
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Andrei Vagin <avagin@gmail.com>

=== Add support for pidfd file descriptors ===

'''Summary:''' Support C/R of pidfd descriptors

The pidfd_open syscall allows opening
a special PID file descriptor. Using it, a user can send a signal to
the process (pidfd_send_signal syscall) or wait for the process
(poll() on the pidfd).

At the moment CRIU can't dump processes that have open pidfds.

'''Links:'''
* https://lwn.net/Articles/801319/
* https://lwn.net/Articles/794707/
* https://github.com/torvalds/linux/blob/v5.16/kernel/fork.c#L1877

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Christian Brauner <christian@brauner.io>
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Use eBPF to lock and unlock the network ===

'''Summary:''' Use eBPF instead of external iptables-restore tool for network lock and unlock.

During checkpointing and restoring CRIU locks the network to make sure no network packets are accepted by the network stack during the time the process is checkpointed. Currently CRIU calls out to iptables-restore to create and delete the corresponding iptables rules. Another approach which avoids calling out to the external binary iptables-restore would be to directly inject eBPF rules. There have been reports from users that iptables-restore fails in some way and eBPF could avoid this external dependency.

'''Links:'''
* https://www.criu.org/TCP_connection#Checkpoint_and_restore_TCP_connection
* https://github.com/systemd/systemd/blob/master/src/core/bpf-firewall.c
* https://blog.zeyady.com/2021-08-16/gsoc-criu

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Prajwal S N <prajwalnadig21@gmail.com>
* Suggested by: Adrian Reber <areber@redhat.com>

=== Files on detached mounts ===

'''Summary:''' Initial support of open files on "detached" mounts

When criu dumps a process with an open fd on a file, it gets the mount identifier (mnt_id) via /proc/<pid>/fdinfo/<fd>, so that criu knows from which exact mount the file was initially opened. This way criu can restore the fd by opening the very same file from the topologically same mount in the restored mount tree.

Restoring the fd from the right mount can be important in different cases: for instance, if the process later wants to resolve paths relative to the fd (resolving from the same file on a different mount can lead to different resolved paths), or if the process wants to check the path to the file via /proc/<pid>/fd/<fd>.

But we have a problem finding the mount on which to reopen the file at restore if we only know the mnt_id but can't find it in /proc/<pid>/mountinfo.

The mountinfo file shows the mount tree topology of the current mntns: parent-child relations, sharing group information, mountpoint and fs root information. If we don't see the mnt_id in it, we don't know anything about this mount.

This can happen in two cases:

* 1) external mount or file: if the file was opened from e.g. the host, its mount would not be visible in the container mountinfo
* 2) the mount was lazily unmounted

In case 1) we have criu options to help criu handle external dependencies.

In case 2), or if no options are provided, criu can't resolve the mnt_id in mountinfo and fails.

'''Solution:'''
We can handle 2) by resolving major/minor via fstat and using name_to_handle_at and open_by_handle_at to open the same file on any other available mount from the same superblock (same major/minor) in the container. This gives us fd2, which refers to the same file as fd but on an existing mount, so we can dump it as usual and mark it as "detached" in the image. On restore criu then knows where to find the file, but instead of just opening fd2 from the restored mount, we create a temporary bindmount and lazily unmount it right after the open, making the file appear as a file on a detached mount.

Known problems with this approach:

* Stat on btrfs gives wrong major/minor
* File handles do not work everywhere
* File handles can return fd2 on a deleted file or on another hardlink; this needs special handling.
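
The "same superblock" check used by this approach can be sketched as follows. This is only an illustration: it uses <code>stat</code> on paths rather than <code>fstat</code> on an open fd to keep the example short, the name <code>same_superblock</code> is hypothetical, and, as the list above notes, the device numbers reported on btrfs cannot be trusted for this comparison.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: report whether two paths live on the same device.
 * st_dev encodes the major/minor pair mentioned in the text.
 * Returns 1 if the devices match, 0 if they differ, -1 if stat() fails.
 */
int same_superblock(const char *path_a, const char *path_b)
{
	struct stat a, b;

	if (stat(path_a, &a) || stat(path_b, &b))
		return -1;
	return a.st_dev == b.st_dev;
}
```

A candidate mount for reopening the file would be one for which this check succeeds; the file handle calls themselves are left out of the sketch.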

Additionally (optional part):
We can export real major/minor in fdinfo (kernel).
We can think of a new kernel interface to get a mount's major/minor and root (shift from fsroot) for detached mounts; if we have it, we don't need the file handle hack to find the file on another mount (see fsinfo or getvalues kernel patches in LKML, can we add this info there?).

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentor: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Checkpointing of POSIX message queues ===

'''Summary:''' Add support for checkpoint/restore of POSIX message queues

POSIX message queues are a widely used inter-process communication mechanism. Message queues are implemented as files on a virtual filesystem (mqueue), where a file descriptor (message queue descriptor) is used to perform operations such as sending or receiving messages. To support checkpoint/restore of POSIX message queues, we need a kernel interface (similar to [https://github.com/checkpoint-restore/criu/commit/8ce9e947051e43430eb2ff06b96dddeba467b4fd MSG_PEEK]) that would enable the retrieval of messages from a queue without removing them. This project aims to implement such an interface that allows retrieving all messages and their priorities from a POSIX message queue.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/2285
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/ipc/mqueue.c
* https://www.man7.org/tlpi/download/TLPI-52-POSIX_Message_Queues.pdf

'''Details:'''
* Skill level: intermediate
* Language: C
* Expected size: 350 hours
* Mentors: Radostin Stoyanov <rstoyanov@fedoraproject.org>, Pavel Tikhomirov <ptikhomirov@virtuozzo.com>, Prajwal S N <prajwalnadig21@gmail.com>
* Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=== Kubernetes operator for managing container checkpoints ===

'''Summary:''' Develop a Kubernetes operator that automates the management of container checkpoints

Container checkpointing has recently been introduced as an alpha feature in Kubernetes.
To enable this feature, the kubelet API was extended with an endpoint that enables the
creation of checkpoints for individual containers. By default, all container checkpoints
are stored as tar archives in <code>/var/lib/kubelet/checkpoints</code> using the following
file name format: <code>checkpoint-<pod-name>_<namespace-name>-<container-name>-<timestamp>.tar</code>.
However, the current implementation does not provide a mechanism for limiting the number
of checkpoints, which may lead to filling up all existing disk space. This project aims to
develop a Kubernetes operator that automates the management of checkpoints and provides
a garbage collection mechanism to discard obsolete checkpoints.

'''Links:'''
* https://github.com/checkpoint-restore/checkpoint-restore-operator
* https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/
* https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/
* https://kubernetes.io/blog/2023/03/10/forensic-container-analysis/
* https://github.com/kubernetes/kubernetes/pull/115888
* https://github.com/kubernetes/enhancements/issues/2008

'''Details:'''
* Skill level: intermediate
* Language: Go
* Expected size: 350 hours
* Mentors: Adrian Reber <areber@redhat.com>, Radostin Stoyanov <rstoyanov@fedoraproject.org>, Prajwal S N <prajwalnadig21@gmail.com>
* Suggested by: Adrian Reber

== Suspended project ideas ==

Listed here are tasks that seem suitable for GSoC, but currently do not have anybody to mentor them.

=== Optimize logging engine ===

'''Summary:''' CRIU writes a lot of log messages while doing its job. Logging is done with a simple fprintf call. The logs are typically useless, but ''if'' some operation fails, they are the only way to find the reason for the failure.

At the same time, the printf family of functions is known to be slow: they need to scan the format string for % specifiers and then convert the arguments into strings. Comparing criu dump with and without logs, the time difference is notable (15%-20%), so speeding up logging will help improve criu performance.

One solution to the problem might be binary logging. The problem with binary logs is the amount of effort needed to convert existing logs to binary form. Preferably, the switch to binary logging either keeps existing log() calls intact or has some automation to convert them.

The option to keep log() calls intact might be a pre-compilation pass over the sources. In this pass each <code>log(fmt, ...)</code> call gets translated into a call to a binary log function that saves the <code>fmt</code> identifier and copies all the args ''as is'' into the log file. The binary log decode utility, required in this case, should then find the fmt string by its ID in the log file and print the resulting message.
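
A minimal sketch of the idea, under simplifying assumptions: all arguments are <code>int</code>, records have a fixed size, and the format table is written by hand, whereas in the proposal a pre-compilation pass would generate it. The names <code>blog_write</code>, <code>blog_decode</code> and <code>fmt_table</code> are made up for this illustration and do not exist in CRIU.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOG_MAX_ARGS 4

/* Hand-written stand-in for the table a pre-compilation pass would emit. */
static const char *const fmt_table[] = {
	"Dumping task %d\n",
	"Mapped %d pages at %d\n",
};

/* One fixed-size binary record: format ID plus raw integer arguments. */
struct blog_record {
	uint32_t fmt_id;
	uint32_t nargs;
	int32_t args[BLOG_MAX_ARGS];
};

/* Writer side: no format parsing, only a copy of the ID and the args.
 * Returns the number of bytes written, or 0 on bad input. */
size_t blog_write(void *buf, size_t room, uint32_t fmt_id, uint32_t nargs, ...)
{
	struct blog_record rec = { .fmt_id = fmt_id, .nargs = nargs };
	va_list ap;
	uint32_t i;

	if (nargs > BLOG_MAX_ARGS || room < sizeof(rec))
		return 0;
	va_start(ap, nargs);
	for (i = 0; i < nargs; i++)
		rec.args[i] = va_arg(ap, int);
	va_end(ap);
	memcpy(buf, &rec, sizeof(rec));
	return sizeof(rec);
}

/* Decoder side: look the format up by ID and do the expensive formatting.
 * Returns the snprintf result, or -1 for an unknown format ID. */
int blog_decode(const void *buf, char *out, size_t out_len)
{
	struct blog_record rec;

	memcpy(&rec, buf, sizeof(rec));
	if (rec.fmt_id >= sizeof(fmt_table) / sizeof(fmt_table[0]))
		return -1;
	/* Arguments beyond those the format consumes are ignored by printf. */
	return snprintf(out, out_len, fmt_table[rec.fmt_id],
			rec.args[0], rec.args[1], rec.args[2], rec.args[3]);
}
```

The point of the split is that the formatting cost moves from the dump path into the offline decoder; a real design would also need to handle strings, pointers and variable-size records.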

'''Links:'''
* [[Better logging]]

'''Details:'''
* Skill level: intermediate
* Language: C, though decoder/preprocessor can be in any language
* Expected size: 350 hours
* Suggested by: Andrei Vagin
* Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== IOUring support ===
The io_uring Asynchronous I/O (AIO) framework is a new Linux I/O interface, first introduced in upstream Linux kernel version 5.1 (released in May 2019). It provides a low-latency and feature-rich interface for applications that require AIO functionality.

'''Links:'''
* https://blogs.oracle.com/linux/an-introduction-to-the-io_uring-asynchronous-io-framework
* https://github.com/axboe/liburing

'''Details:'''
* Skill level: expert (+linux kernel)
* Expected size: 350 hours

=== Add support for SPFS ===

'''Summary:''' The SPFS is a special filesystem that allows checkpoint and restore of such things as NFS and FUSE

NFS support is already implemented in Virtuozzo CRIU, but it would be very beneficial to port it to mainline CRIU. An important part of this is implementing the integration of the Stub-Proxy File System (SPFS) with LXC/yet_another_containers_environment.

'''Links:'''
* https://github.com/checkpoint-restore/criu/issues/60
* https://github.com/checkpoint-restore/criu/issues/53
* https://github.com/skinsbursky/spfs
* https://patchwork.criu.org/series/137/

'''Details:'''
* Skill level: expert
* Language: C
* Mentor: Alexander Mikhalitsyn <alexander@mihalicyn.com>
* Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>

=== Anonymise image files ===

'''Summary:''' Teach [[CRIT]] to remove sensitive information from images

When reporting a bug, it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. We need to teach CRIT to "anonymise" images for publication.

List of data to shred:

* Memory contents. For the sake of investigation, all the memory contents can simply be removed. Only the sizes of the pages*.img files are needed.
* Paths to files. Here we should keep the relations between paths. The simplest way seems to be replacing file names with "random" (or sequential) strings, BUT (!) taking care to keep this mapping 1:1. Note that file paths may also sit in sk-unix.img.
* Registers.
* Process names. (But relations should be kept).
* Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
* Ghost files.
* Tarballs with tmpfs-s.
* IP addresses in sk-inet-s, ip tool dumps and net*.img.
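
The 1:1 path mapping from the list above can be sketched as follows. This is an illustration only, written in C for consistency with the other examples on this page even though CRIT itself is Python; the names <code>anon_map</code> and <code>anon_path</code> are hypothetical. Each path component gets a sequential replacement, so paths that share a directory still share it after anonymisation.

```c
#include <stdio.h>
#include <string.h>

#define ANON_MAX 64

/* Table of components seen so far; the index is the replacement ID. */
struct anon_map {
	char names[ANON_MAX][256];
	unsigned int count;
};

/* Return the stable ID of one path component, adding it if unseen. */
static int anon_component(struct anon_map *m, const char *s, size_t len)
{
	unsigned int i;

	if (len >= sizeof(m->names[0]))
		return -1;
	for (i = 0; i < m->count; i++)
		if (strlen(m->names[i]) == len && !memcmp(m->names[i], s, len))
			return (int)i;
	if (m->count == ANON_MAX)
		return -1;
	memcpy(m->names[m->count], s, len);
	m->names[m->count][len] = '\0';
	return (int)m->count++;
}

/* Rewrite path into out, replacing every component with "f<ID>".
 * Returns 0 on success, -1 if the table or the output buffer is full. */
int anon_path(struct anon_map *m, const char *path, char *out, size_t out_len)
{
	size_t used = 0;
	const char *p = path;

	if (out_len == 0)
		return -1;
	out[0] = '\0';
	while (*p) {
		size_t len = strcspn(p, "/");
		int id, n;

		if (len == 0) { /* a '/' separator is copied unchanged */
			if (used + 1 >= out_len)
				return -1;
			out[used++] = '/';
			out[used] = '\0';
			p++;
			continue;
		}
		id = anon_component(m, p, len);
		if (id < 0)
			return -1;
		n = snprintf(out + used, out_len - used, "f%d", id);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += (size_t)n;
		p += len;
	}
	return 0;
}
```

With an empty map, <code>/etc/passwd</code> becomes <code>/f0/f1</code> and a later <code>/etc/shadow</code> becomes <code>/f0/f2</code>, so the shared parent directory stays visible while the names are gone.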

'''Links:'''
* [[Anonymize image files]]
* https://github.com/checkpoint-restore/criu/issues/360
* [[CRIT]], [[Images]]

'''Details:'''
* Skill level: beginner
* Language: Python

[[Category:GSoC]]
[[Category:Development]]

Supported architectures

2023-08-07T16:58:12Z

Amikhalitsyn: Add page with list/info about all supported architectures

{| class="wikitable" style="margin:auto"

|-
! Arch !! Status !! Link to PR when added !! Comment or wiki page link

|-
| x86_64
| Supported
|
| Main architecture

|-
| s390
| Maintained
|
| -

|-
| ppc64le
| Maintained
|
| -

|-
| arm
| Maintained
|
| -

|-
| arm64
| Maintained
|
| -

|-
| loongarch
| Maintained
| https://github.com/checkpoint-restore/criu/pull/2183
| -

|-
| riscv
| In development
| https://github.com/checkpoint-restore/criu/pull/2234
| -

|-
| mips
| Unknown/Orphan
| https://github.com/checkpoint-restore/criu/pull/933
| -

|}

Google Summer of Code

2023-03-17T11:39:16Z

Amikhalitsyn: /* What are the requirements to be accepted for GSoC? */

[[Category:GSoC]]

== Introduction ==

[https://summerofcode.withgoogle.com/ Google Summer of Code] (GSoC) is a program where Google pairs university students with open source organizations to work together over the summer. For their efforts, students are rewarded with stipends provided by Google. During the summer break, students work on ideas for open-source projects while mentors from the projects help them by providing support and code reviews.

== What are the requirements to be accepted for GSoC? ==
Apart from the [https://developers.google.com/open-source/gsoc/faq#what_are_the_eligibility_requirements_for_participation official requirements] you will have to meet the following:

* Successfully clone and compile CRIU
* Subscribe to the [https://openvz.org/mailman/listinfo/criu mailing list]
* Join the CRIU community in [https://gitter.im/save-restore/CRIU Gitter]
* Read [[GSoC Students Recommendations]]
* Have a small upstream contribution ([https://github.com/checkpoint-restore/criu/issues?q=is%3Aissue+is%3Aopen+label%3Agood-first-issue good first issues])

== How can I have an upstream contribution? ==
Strange as it might look, this is proof that you have managed to clone and build your own CRIU, and subscribed to the list.
Don't worry, there are no requirements on the initial contribution when it comes to the complexity or size of the change.
Before working on a patch, please read [[Installation]], [[How to submit patches]] and the [https://github.com/checkpoint-restore/criu/blob/criu-dev/CONTRIBUTING.md Contributing guide].

== Project Ideas ==

The project ideas are listed in [[Google Summer of Code Ideas]].

== Completed projects ==

Several projects successfully [[GSoC completed projects|graduated GSoC]] and are now an integral part of CRIU.

Google Summer of Code

2023-03-17T11:34:51Z

Amikhalitsyn: /* What are the requirements to be accepted for GSoC? */ add Gitter

[[Category:GSoC]]

== Introduction ==

[https://summerofcode.withgoogle.com/ Google Summer of Code] (GSoC) is a program where Google pairs university students with open source organizations to work together over the summer. For their efforts, students are rewarded with stipends provided by Google. During the summer break, students work on ideas for open-source projects while mentors from the projects help them by providing support and code reviews.

== What are the requirements to be accepted for GSoC? ==
Apart from the [https://developers.google.com/open-source/gsoc/faq#what_are_the_eligibility_requirements_for_participation official requirements] you will have to meet the following:

* Successfully clone and compile CRIU
* Subscribe to the [https://openvz.org/mailman/listinfo/criu mailing list]
* Join the CRIU community in [https://gitter.im/save-restore/CRIU Gitter]
* Have a small upstream contribution ([https://github.com/checkpoint-restore/criu/issues?q=is%3Aissue+is%3Aopen+label%3Agood-first-issue good first issues])

== How can I have an upstream contribution? ==
Strange as it might look, this is proof that you have managed to clone and build your own CRIU, and subscribed to the list.
Don't worry, there are no requirements on the initial contribution when it comes to the complexity or size of the change.
Before working on a patch, please read [[Installation]], [[How to submit patches]] and the [https://github.com/checkpoint-restore/criu/blob/criu-dev/CONTRIBUTING.md Contributing guide].

== Project Ideas ==

The project ideas are listed in [[Google Summer of Code Ideas]].

== Completed projects ==

Several projects successfully [[GSoC completed projects|graduated GSoC]] and are now an integral part of CRIU.

Google Summer of Code

2023-03-13T10:28:43Z

Amikhalitsyn: added good first issues link

[[Category:GSoC]]

== Introduction ==

[https://summerofcode.withgoogle.com/ Google Summer of Code] (GSoC) is a program where Google pairs university students with open source organizations to work together over the summer. For their efforts, students are rewarded with stipends provided by Google. During the summer break, students work on ideas for open-source projects while mentors from the projects help them by providing support and code reviews.

== What are the requirements to be accepted for GSoC? ==
Apart from the [https://developers.google.com/open-source/gsoc/faq#what_are_the_eligibility_requirements_for_participation official requirements] you will have to meet the following:

* Successfully clone and compile CRIU
* Subscribe to the [https://openvz.org/mailman/listinfo/criu mailing list]
* Have a small upstream contribution ([https://github.com/checkpoint-restore/criu/issues?q=is%3Aissue+is%3Aopen+label%3Agood-first-issue good first issues])

== How can I have an upstream contribution? ==
Strange as it might look, this is proof that you have managed to clone and build your own CRIU, and subscribed to the list.
Don't worry, there are no requirements on the initial contribution when it comes to the complexity or size of the change.
Before working on a patch, please read [[Installation]], [[How to submit patches]] and the [https://github.com/checkpoint-restore/criu/blob/criu-dev/CONTRIBUTING.md Contributing guide].

== Project Ideas ==

The project ideas are listed in [[Google Summer of Code Ideas]].

== Completed projects ==

Several projects successfully [[GSoC completed projects|graduated GSoC]] and are now an integral part of CRIU.

Google Summer of Code Ideas

2023-02-27T17:29:17Z

Amikhalitsyn: updated my email

File:3.17.jpg

2022-04-25T16:03:01Z

Amikhalitsyn: Amikhalitsyn moved page File:3.17.JPG to File:3.17.jpg without leaving a redirect

== Summary ==
http://woodswalksandwildlife.blogspot.com/2012/09/american-redstart.html

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

File:3.17.jpg

2022-04-25T16:01:34Z

Amikhalitsyn: http://woodswalksandwildlife.blogspot.com/2012/09/american-redstart.html This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

== Summary ==
http://woodswalksandwildlife.blogspot.com/2012/09/american-redstart.html

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Download/criu/3.17

2022-04-25T15:51:39Z

Amikhalitsyn: Created page with " right {{Release..."

[[Image:3.17.jpg|400px|right]]
{{Release|3.17}}

=== New features ===
* Introduced [[Mount-v2|mount-v2 engine]]
* Added support for MAP_HUGETLB mappings
* Added support for [[Rseq|Linux Restartable Sequences]]
* Added support for SOCK_SEQPACKET unix sockets
* CRIU AMD GPU plugin

=== Bugfixes ===
* GCC 12 compatibility fixes
* cgroup: fix --manage-cgroups=ignore
* several memory leaks fixed in net, files, mount, tun and config subsystems

=== Improvements ===
* bpf: switch from deprecated bpf_create_map_xattr to bpf_map_create
* bpfmap: handle map_extra field
* setsockopt(SO_BUF_LOCK) support for tcp sockets

Template:Codename

2022-04-25T15:34:15Z

Amikhalitsyn: add v3.17 Radiant Redstart

<includeonly>{{#switch: v{{{1}}}
| v2.1 = Steel Lapwing
| v2.2 = Carbon Nightingale
| v2.3 = Wooden Duck
| v2.4 = Marble Lark
| v2.5 = Concrete Oriole
| v2.6 = Paper Crane
| v2.7 = Rubber Owl
| v2.8 = Bronze Siskin
| v2.9 = Silk Tit
| v2.10 = Brass Waxwing
| v2.11 = Acrylic Bullfinch
| v2.12 = Vulcanite Rook
| v3.0 = Basalt Wagtail
| v3.1 = Graphene Swift
| v3.2 = Tin Hoopoe
| v3.3 = Crystal Pelican
| v3.4 = Cobalt Swan
| v3.5 = Clay Jay
| v3.6 = Alabaster Finch
| v3.7 = Vinyl Magpie
| v3.8 = Snow Bunting
| v3.9 = Sand Martin
| v3.10 = Granite Eagle
| v3.11 = Glass Flamingo
| v3.12 = Ice Penguin
| v3.13 = Silicon Willet
| v3.14 = Platinum Peacock
| v3.15 = Titanium Falcon
| v3.16 = Petrified Puffin
| v3.16.1 = Petrified Puffin
| v3.17 = Radiant Redstart
| 
}}</includeonly><noinclude>

This template is used to get the codename of a specified release. If there is no codename, an empty string is produced.

== Usage ==

<pre><nowiki>{{Codename|VERSION}}</nowiki></pre>

== Examples ==

{| class="wikitable"
! Markup
! Result
|-
| <pre><nowiki>{{Codename|3.3}}</nowiki></pre>
| {{Codename|3.3}}
|-
| <pre><nowiki>{{Codename|2.1}}</nowiki></pre>
| {{Codename|2.1}}
|-
| <pre><nowiki>{{Codename|2.0}}</nowiki></pre>
| {{Codename|2.0}}
|}

== See also ==

* [[Release schedule]] for future codenames
* [[Template:Release date]]
* [[Template:criu]]
* [[Template:Release]]
* [[Template:Latest release]]

</noinclude>

Restartable Sequences

2022-04-20T06:20:52Z

Amikhalitsyn:

"Restartable sequences" (<code>rseq</code>) are small segments of user-space code designed to access per-CPU data structures without the need for heavyweight locking.
rseq is supported since Linux kernel 4.18 [1]

I strongly suggest reading the article [https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/ Linux restartable sequences] before this one.

== Linux kernel interface ==

The Linux kernel interface for rseq is fairly simple. It's just the <code>rseq</code> syscall:
<code>sys_rseq(struct rseq *rseq, uint32_t rseq_len, int flags, uint32_t sig)</code>

<pre>
enum rseq_cs_flags {
	RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT),
	RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
	RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
};

struct rseq_cs {
	__u32 version; /* always 0 at this moment */
	enum rseq_cs_flags flags;
	void *start_ip;
	/* Offset from start_ip. */
	intptr_t post_commit_offset;
	void *abort_ip;
};

struct rseq {
	__u32 cpu_id_start;
	__u32 cpu_id;
	struct rseq_cs *rseq_cs;
	enum rseq_cs_flags flags;
};
</pre>

From the userspace side, we need to keep <code>struct rseq</code> somewhere and register it on the kernel side using the <code>rseq</code> syscall.
Then, once we want to execute some code as an rseq critical section (<code>rseq cs</code> or just CS), we need to allocate a <code>struct rseq_cs</code> and fill it with data.
We have to specify the start address of our CS and the address of the abort handler (called when the CS is interrupted by a preemption, migration
or signal). Then we need to store a pointer to the <code>struct rseq_cs</code> in the <code>(struct rseq).rseq_cs</code> field.

== What about <code>flags</code>? ==

You may have noticed that both <code>struct rseq</code> and <code>struct rseq_cs</code> have a <code>flags</code> field. It may take values from <code>enum rseq_cs_flags</code>.

First of all, a user may specify flags in either place; they will be combined on the kernel side:
<pre>
static int rseq_need_restart(struct task_struct *t, u32 cs_flags)
{
	u32 flags, event_mask;
	int ret;

	/* Get thread flags. */
	ret = get_user(flags, &t->rseq->flags);
	if (ret)
		return ret;

	/* Take critical section flags into account. */
	flags |= cs_flags; // <<<<<<<< here we have flags combined from struct rseq + struct rseq_cs
</pre>

The most common <code>flags</code> value is zero. In this case, the rseq CS will be interrupted (and the IP will be fixed up to the abort handler)
if preemption, migration, or a signal occurs. But there are situations when users may not want the section to be aborted when one of these events happens.

It's worth mentioning that <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> can be used only in combination with <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> and <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>:
<pre>
/*
 * Restart on signal can only be inhibited when restart on
 * preempt and restart on migrate are inhibited too. Otherwise,
 * a preempted signal handler could fail to restart the prior
 * execution context on sigreturn.
 */
if (unlikely((flags & RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL) &&
	     (flags & RSEQ_CS_PREEMPT_MIGRATE_FLAGS) !=
	     RSEQ_CS_PREEMPT_MIGRATE_FLAGS))
	return -EINVAL;
</pre>
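
The rule quoted above can be tried out in userspace with a small stand-alone sketch. The bit positions follow the kernel's <code>include/uapi/linux/rseq.h</code>; the function name <code>rseq_flags_valid</code> is made up for this illustration and is not part of the kernel or of CRIU.

```c
#include <stdint.h>

/* Same bit positions as in include/uapi/linux/rseq.h. */
enum {
	RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT = 1U << 0,
	RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL  = 1U << 1,
	RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE = 1U << 2,
};

/*
 * Hypothetical helper mirroring the quoted kernel check: combine the
 * flags from struct rseq and struct rseq_cs, then require that
 * NO_RESTART_ON_SIGNAL only appears together with both other flags.
 * Returns 1 if the combination is accepted, 0 if it is rejected.
 */
int rseq_flags_valid(uint32_t rseq_flags, uint32_t cs_flags)
{
	const uint32_t preempt_migrate = RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT |
					 RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE;
	uint32_t flags = rseq_flags | cs_flags;

	if ((flags & RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL) &&
	    (flags & preempt_migrate) != preempt_migrate)
		return 0;
	return 1;
}
```

Because the two flag words are OR-ed first, the required flags may be split between <code>struct rseq</code> and <code>struct rseq_cs</code>.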

== How CRIU handles rseq ==

CRIU handles rseq differently depending on the particular case. Let's classify and cover all of them.

# the process is not inside the rseq critical section
# the process is inside the rseq CS
## <code>flags</code> is <code>0</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>
## <code>flags</code> is <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code>

=== the process is not inside the rseq critical section ===

This is the simplest case. The process just has a <code>struct rseq</code> registered in the kernel, but the instruction pointer (IP) is currently not inside a CS.

==== Dump ====
We only need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
To achieve that we use the special ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).

==== Restore ====
We need to take data about the <code>struct rseq</code> from the image (see images/rseq.proto) and register it from the parasite context using the <code>rseq</code> syscall (take a look at <code>restore_rseq</code> in criu/pie/restorer.c).

=== inside CS: <code>flags</code> is <code>0</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code> ===

The process was caught with its IP inside a CS. Can we act as before, that is, dump the <code>struct rseq</code> address, restore it, and so on? No, we can't.
The reason is that CRIU saves the IP as it was during the dump, but the rseq semantics are to jump to the abort handler if CS execution was interrupted.
In this particular case <code>flags</code> is equal to <code>0</code>, <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>.
This means that if the CS is interrupted by preemption or migration (<code>0</code>), by migration (<code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code>), or by preemption (<code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>),
the kernel will fix up the IP of the process to the abort handler address.

If CRIU simply saved the IP as it was and restored it, that would be a serious problem that may break the user application (or even cause a crash!).

Let's look at the <code>fixup_thread_rseq</code> function:
<pre>
if (task_in_rseq(rseq_cs, TI_IP(core))) {
	struct pid *tid = &item->threads[i];

	...

	pr_warn("The %d task is in rseq critical section. IP will be set to rseq abort handler addr\n",
		tid->real);

	...

	if (!(rseq_cs->flags & RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL)) {
		pr_warn("The %d task is in rseq critical section. IP will be set to rseq abort handler addr\n",
			tid->real);

		TI_IP(core) = rseq_cs->abort_ip;

		if (item->pid->real == tid->real) {
			compel_set_leader_ip(dmpi(item)->parasite_ctl, rseq_cs->abort_ip);
		} else {
			compel_set_thread_ip(dmpi(item)->thread_ctls[i], rseq_cs->abort_ip);
		}
	}
}
</pre>

It checks whether the process IP is inside the CS and, if so, fixes it up to the abort handler IP, as the kernel does.

==== Dump ====
We need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
To achieve that we use the special ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).

We have to fix up the IP to the abort handler.

==== Restore ====
We need to take data about the <code>struct rseq</code> from the image (see images/rseq.proto) and register it from the parasite context using the <code>rseq</code> syscall (take a look at <code>restore_rseq</code> in criu/pie/restorer.c).

No additional actions here. The process will be restored and will continue execution from the abort handler (not within the rseq CS!).

=== inside CS: <code>flags</code> is <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> ===

This is a rare case, but we support it too. If the rseq CS has the <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> flag, it is technically
non-abortable. So, at first glance, it seems like we don't need to do anything special: save the rseq structure address and don't fix up the IP.
This is incorrect.

The kernel will clear the <code>(struct rseq).rseq_cs</code> pointer once we jump into the parasite on dump:
<pre>
static int rseq_ip_fixup(struct pt_regs *regs)
{
	...

	/*
	 * Handle potentially not being within a critical section.
	 * If not nested over a rseq critical section, restart is useless.
	 * Clear the rseq_cs pointer and return.
	 */
	if (!in_rseq_cs(ip, &rseq_cs))
		return clear_rseq_cs(t);
</pre>

After the restore, the process will continue the rseq CS execution from the same place (that's okay), but from the kernel's point of view
the process will continue this execution as not being within the rseq CS (that's bad!), because the kernel determines the execution context from the <code>(struct rseq).rseq_cs</code> field.

==== Dump ====
We need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
To achieve that we use the special ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).

We save the IP as it was (no fixup), but we have to save the <code>(struct rseq).rseq_cs</code> field into the CRIU image.

==== Restore ====
We need to take data about the <code>struct rseq</code> from the image (see images/rseq.proto) and register it from the parasite context using the <code>rseq</code> syscall (take a look at <code>restore_rseq</code> in criu/pie/restorer.c).

We need to restore the <code>(struct rseq).rseq_cs</code> memory externally using ptrace <code>POKEAREA</code> (see <code>restore_rseq_cs</code>).
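
The three cases covered above can be summarised in a small stand-alone sketch. It is not CRIU code: <code>struct cs_desc</code>, <code>ip_in_cs</code> and <code>classify</code> are hypothetical names, and the structure is a simplified stand-in that carries only the <code>struct rseq_cs</code> fields needed for the decision.

```c
#include <stdint.h>

#define RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL (1U << 1)

/* Simplified stand-in for the struct rseq_cs fields used here. */
struct cs_desc {
	uint32_t flags;
	uint64_t start_ip;
	uint64_t post_commit_offset;
	uint64_t abort_ip;
};

enum dump_action {
	DUMP_PLAIN,        /* not in a CS: record address, length, signature */
	DUMP_FIXUP_IP,     /* abortable CS: move the IP to the abort handler */
	DUMP_SAVE_RSEQ_CS, /* NO_RESTART_ON_SIGNAL: keep IP, save rseq_cs */
};

/* The IP is in the CS when start_ip <= ip < start_ip + post_commit_offset.
 * The unsigned subtraction wraps around, so ip < start_ip fails as well. */
int ip_in_cs(const struct cs_desc *cs, uint64_t ip)
{
	return cs && ip - cs->start_ip < cs->post_commit_offset;
}

/* Pick the dump-time action and apply the IP fixup where it is needed. */
enum dump_action classify(const struct cs_desc *cs, uint64_t *ip)
{
	if (!ip_in_cs(cs, *ip))
		return DUMP_PLAIN;
	if (cs->flags & RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL)
		return DUMP_SAVE_RSEQ_CS;
	*ip = cs->abort_ip;
	return DUMP_FIXUP_IP;
}
```

In all three cases the <code>struct rseq</code> registration data is dumped and restored the same way; only the handling of the IP and of the <code>rseq_cs</code> pointer differs.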

== TODO ==

* tests for all architectures (right now we have ZDTM tests only for x86_64)
* improve support of built-in rseq for non-Glibc libraries
* pre-dump tests (?)
* leave-running tests (?)
* crfail test
* threaded test

== Useful links ==

* [1] https://github.com/torvalds/linux/blob/b2d229d4ddb17db541098b83524d901257e93845/kernel/rseq.c#L1
* [2] https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/
* [3] https://lwn.net/Articles/883104/
* [4] https://patchwork.sourceware.org/project/glibc/list/?series=5530&state=*

[[Category: Under the hood]]
[[Category: Editor help needed]]

Restartable Sequences

2022-04-18T12:41:57Z

Amikhalitsyn:

"Restartable sequences" (<code>rseq</code>) are small segments of user-space code designed to access per-CPU data structures without the need for heavyweight locking.
rseq is supported since Linux kernel 4.18 [1]

I strongly suggest reading the article [https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/ Linux restartable sequences] before this one.

== Linux kernel interface ==

The Linux kernel interface for rseq is fairly simple. It's just the <code>rseq</code> syscall:
<code>sys_rseq(struct rseq *rseq, uint32_t rseq_len, int flags, uint32_t sig)</code>

<pre>
enum rseq_cs_flags {
	RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT),
	RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
	RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
};

struct rseq_cs {
	__u32 version; /* always 0 at this moment */
	enum rseq_cs_flags flags;
	void *start_ip;
	/* Offset from start_ip. */
	intptr_t post_commit_offset;
	void *abort_ip;
};

struct rseq {
	__u32 cpu_id_start;
	__u32 cpu_id;
	struct rseq_cs *rseq_cs;
	enum rseq_cs_flags flags;
};
</pre>

From the userspace side, we need to keep <code>struct rseq</code> somewhere and register it on the kernel side using the <code>rseq</code> syscall.
Then, once we want to execute some code as an rseq critical section (<code>rseq cs</code> or just CS), we need to allocate a <code>struct rseq_cs</code> and fill it with data.
We have to specify the start address of our CS and the address of the abort handler (called when the CS is interrupted by a preemption, migration
or signal). Then we need to store a pointer to the <code>struct rseq_cs</code> in the <code>(struct rseq).rseq_cs</code> field.

== What about <code>flags</code>? ==

You may have noticed that both <code>struct rseq</code> and <code>struct rseq_cs</code> have a <code>flags</code> field. It may take values from <code>enum rseq_cs_flags</code>.

First of all, a user may specify flags in either place; they will be combined on the kernel side:
<pre>
static int rseq_need_restart(struct task_struct *t, u32 cs_flags)
{
	u32 flags, event_mask;
	int ret;

	/* Get thread flags. */
	ret = get_user(flags, &t->rseq->flags);
	if (ret)
		return ret;

	/* Take critical section flags into account. */
	flags |= cs_flags; // <<<<<<<< here we have flags combined from struct rseq + struct rseq_cs
</pre>

The most common <code>flags</code> value is zero. In this case, the rseq CS will be interrupted (and the IP will be fixed up to the abort handler)
if preemption, migration, or a signal occurs. But there are situations when users may not want the section to be aborted when one of these events happens.

It's worth mentioning that <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> can be used only in combination with <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> and <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>:
<pre>
/*
 * Restart on signal can only be inhibited when restart on
 * preempt and restart on migrate are inhibited too. Otherwise,
 * a preempted signal handler could fail to restart the prior
 * execution context on sigreturn.
 */
if (unlikely((flags & RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL) &&
	     (flags & RSEQ_CS_PREEMPT_MIGRATE_FLAGS) !=
	     RSEQ_CS_PREEMPT_MIGRATE_FLAGS))
	return -EINVAL;
</pre>

== How CRIU handles rseq ==

CRIU handles rseq differently depending on the particular case. Let's classify and cover all of them.

# the process is not inside the rseq critical section
# the process is inside the rseq CS
## <code>flags</code> is <code>0</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>
## <code>flags</code> is <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code>

=== the process is not inside the rseq critical section ===

This is the simplest case. The process just has a <code>struct rseq</code> registered in the kernel, but the instruction pointer (IP) is currently not inside a CS.

==== Dump ====
We only need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
To achieve that we use the special ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).

==== Restore ====
We need to take data about the <code>struct rseq</code> from the image (see images/rseq.proto) and register it from the parasite context using the <code>rseq</code> syscall (take a look at <code>restore_rseq</code> in criu/pie/restorer.c).

=== inside CS: <code>flags</code> is <code>0</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code> ===

The process was caught with its IP inside a CS. Can we act as before, that is, dump the <code>struct rseq</code> address, restore it, and so on? No, we can't.
The reason is that CRIU saves the IP as it was during the dump, whereas the rseq semantics require a jump to the abort handler if CS execution was interrupted.
In this particular case <code>flags</code> is equal to <code>0</code>, <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code>, or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>.
This means that if the CS is interrupted by preemption or migration (<code>0</code>), by migration (<code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code>), or by preemption (<code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>),
the kernel will fix up the IP of the process to the abort handler address.

If CRIU just saved the IP as it was and restored it, the process would resume in the middle of an interrupted CS. That is a serious problem that may break the user application (or even crash it!).

Let's look at the <code>fixup_thread_rseq</code> function:
<pre>
if (task_in_rseq(rseq_cs, TI_IP(core))) {
struct pid *tid = &item->threads[i];

...

pr_warn("The %d task is in rseq critical section. IP will be set to rseq abort handler addr\n",
tid->real);

...

if (!(rseq_cs->flags & RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL)) {
pr_warn("The %d task is in rseq critical section. IP will be set to rseq abort handler addr\n",
tid->real);

TI_IP(core) = rseq_cs->abort_ip;

if (item->pid->real == tid->real) {
compel_set_leader_ip(dmpi(item)->parasite_ctl, rseq_cs->abort_ip);
} else {
compel_set_thread_ip(dmpi(item)->thread_ctls[i], rseq_cs->abort_ip);
}
}
}
</pre>

It checks whether the process IP is inside the CS and, if so, fixes it up to the abort handler IP, as the kernel does.
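The "is the IP inside the CS" test can be sketched as a single unsigned comparison, assuming it mirrors the kernel's <code>in_rseq_cs</code> check quoted later on this page. The structure and function names below are hypothetical and only carry the fields the check needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor: only the fields the check needs. */
struct demo_rseq_cs {
	uint64_t start_ip;
	uint64_t post_commit_offset;
	uint64_t abort_ip;
};

/* True when ip lies in [start_ip, start_ip + post_commit_offset). */
static bool demo_ip_in_cs(const struct demo_rseq_cs *cs, uint64_t ip)
{
	/*
	 * Unsigned wrap-around makes an ip below start_ip produce a huge
	 * difference, so one comparison covers both bounds.
	 */
	return ip - cs->start_ip < cs->post_commit_offset;
}
```

The upper bound is exclusive: an IP exactly at <code>start_ip + post_commit_offset</code> is already past the commit point and counts as outside the CS.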

==== Dump ====
We need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
To achieve that, we use the ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).

We also have to fix up the IP to the abort handler.

==== Restore ====
We need to take the data about the <code>struct rseq</code> from the image (see images/rseq.proto) and register it from the parasite context using the <code>rseq</code> syscall (take a look at <code>restore_rseq</code> in criu/pie/restorer.c).

No additional actions are needed here. The process will be restored and will continue execution from the abort handler (not within the rseq CS!).

=== inside CS: <code>flags</code> is <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> ===

This is a rare case, but we support it too. If the rseq CS has the <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> flag, it is technically
non-abortable. So, at first glance, it seems that we don't need to do anything special: save the rseq structure address and leave the IP alone.
This is incorrect.

The kernel clears the <code>(struct rseq).rseq_cs</code> pointer once we jump into the parasite during the dump:
<pre>
static int rseq_ip_fixup(struct pt_regs *regs)
{
...

/*
* Handle potentially not being within a critical section.
* If not nested over a rseq critical section, restart is useless.
* Clear the rseq_cs pointer and return.
*/
if (!in_rseq_cs(ip, &rseq_cs))
return clear_rseq_cs(t);
</pre>

After the restore, the process will continue executing the rseq CS from the same place (which is okay), but from the kernel's point of view
it will no longer be within the rseq CS (which is bad!), because the kernel determines the execution context from the <code>(struct rseq).rseq_cs</code> field.

==== Dump ====
We need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
To achieve that, we use the ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).

We save the IP as it was (no fixup), but we have to save the <code>(struct rseq).rseq_cs</code> field into the CRIU image.

==== Restore ====
We need to take the data about the <code>struct rseq</code> from the image (see images/rseq.proto) and register it from the parasite context using the <code>rseq</code> syscall (take a look at <code>restore_rseq</code> in criu/pie/restorer.c).

We need to restore the <code>(struct rseq).rseq_cs</code> memory externally using ptrace <code>POKEAREA</code> (see <code>restore_rseq_cs</code>).
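The three cases described above can be summarised as one decision made at dump time. The following is an illustrative sketch only, not CRIU's actual code: <code>demo_dump_plan</code> and <code>demo_plan_dump</code> are hypothetical names, and the flag bit position is assumed to match <code>linux/rseq.h</code>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NO_RESTART_ON_SIGNAL (1U << 1) /* bit position assumed as in linux/rseq.h */

/* Hypothetical summary of what the dump has to do for one thread. */
struct demo_dump_plan {
	uint64_t saved_ip;  /* IP that ends up in the image */
	bool save_rseq_cs;  /* whether (struct rseq).rseq_cs must be saved and restored */
};

static struct demo_dump_plan demo_plan_dump(bool in_cs, uint32_t cs_flags,
					    uint64_t ip, uint64_t abort_ip)
{
	struct demo_dump_plan plan = { .saved_ip = ip, .save_rseq_cs = false };

	if (!in_cs)
		return plan;              /* not inside a CS: nothing special */
	if (cs_flags & DEMO_NO_RESTART_ON_SIGNAL)
		plan.save_rseq_cs = true; /* non-abortable CS: keep IP, preserve rseq_cs */
	else
		plan.saved_ip = abort_ip; /* abortable CS: resume at the abort handler */
	return plan;
}
```

In every case the address, length, and signature of the <code>struct rseq</code> are dumped as well; the sketch only covers what differs between the cases.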

== TODO ==

* tests for all architectures (right now we have ZDTM tests only for x86_64)
* improved support for built-in rseq in non-glibc libraries
* pre-dump tests (?)
* leave-running tests (?)
* crfail test
* threaded test

== Useful links ==

* [1] https://github.com/torvalds/linux/blob/b2d229d4ddb17db541098b83524d901257e93845/kernel/rseq.c#L1
* [2] https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/
* [3] https://lwn.net/Articles/883104/

[[Category: Under the hood]]
[[Category: Editor help needed]]

Restartable Sequences

2022-04-18T12:22:09Z

Amikhalitsyn:

"Restartable sequences" (<code>rseq</code>) are small segments of user-space code designed to access per-CPU data structures without the need for heavyweight locking.
rseq has been supported since Linux kernel 4.18 [1].

I strongly suggest reading the article [https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/ Linux restartable sequences] before this one.

== Linux kernel interface ==

The Linux kernel interface for rseq is fairly simple. It is just the <code>rseq</code> syscall:
<code>sys_rseq(struct rseq *rseq, uint32_t rseq_len, int flags, uint32_t sig)</code>

<pre>
enum rseq_cs_flags {
RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT),
RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
};

struct rseq_cs {
__u32 version; /* always 0 at this moment */
enum rseq_cs_flags flags;
void *start_ip;
/* Offset from start_ip. */
intptr_t post_commit_offset;
void *abort_ip;
};

struct rseq {
__u32 cpu_id_start;
__u32 cpu_id;
struct rseq_cs *rseq_cs;
enum rseq_cs_flags flags;
};
</pre>

From the userspace side, we need to keep a <code>struct rseq</code> somewhere and register it with the kernel using the <code>rseq</code> syscall.
Then, whenever we want to execute some code as an rseq critical section (<code>rseq cs</code> or just CS), we need to allocate a <code>struct rseq_cs</code> and fill it
with data. We have to specify the start address of our CS and the address of the abort handler (called when the CS is interrupted by preemption, migration,
or a signal). Then we need to store a pointer to the <code>struct rseq_cs</code> in the <code>(struct rseq).rseq_cs</code> field.

== What about <code>flags</code>? ==

You may have noticed that both <code>struct rseq</code> and <code>struct rseq_cs</code> have a <code>flags</code> field. It may take values from <code>enum rseq_cs_flags</code>.

First of all, a user may specify flags in either place; they are combined on the kernel side:
<pre>
static int rseq_need_restart(struct task_struct *t, u32 cs_flags)
{
u32 flags, event_mask;
int ret;

/* Get thread flags. */
ret = get_user(flags, &t->rseq->flags);
if (ret)
return ret;

/* Take critical section flags into account. */
flags |= cs_flags; // <<<<<<<< here we have flags combined from struct rseq + struct rseq_cs
</pre>
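The effect of combining the two <code>flags</code> fields can be sketched as follows. This is a simplified model rather than the kernel function itself: <code>demo_need_restart</code> and the <code>DEMO_EVENT_*</code> names are hypothetical, and it assumes that pending events are tracked in a mask that uses the same bit positions as the flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Event bits, assumed to share bit positions with the NO_RESTART_ON_* flags. */
#define DEMO_EVENT_PREEMPT (1U << 0)
#define DEMO_EVENT_SIGNAL  (1U << 1)
#define DEMO_EVENT_MIGRATE (1U << 2)

/*
 * Hypothetical helper: OR the per-thread flags with the per-CS flags,
 * then report whether any pending event is not inhibited by them.
 */
static bool demo_need_restart(uint32_t thread_flags, uint32_t cs_flags,
			      uint32_t event_mask)
{
	uint32_t flags = thread_flags | cs_flags;

	return (event_mask & ~flags) != 0;
}
```

Because the two fields are OR-ed, setting a flag in either <code>struct rseq</code> or <code>struct rseq_cs</code> inhibits the corresponding restart.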

The most common <code>flags</code> value is zero. In this case, the rseq CS will be interrupted (and the IP will be fixed up to the abort handler)
if preemption, migration, or a signal occurs. However, there are situations in which users may not want the section to be aborted when one of these events happens.

It's worth mentioning that <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> can be used only in combination with <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> and <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>:
<pre>
/*
* Restart on signal can only be inhibited when restart on
* preempt and restart on migrate are inhibited too. Otherwise,
* a preempted signal handler could fail to restart the prior
* execution context on sigreturn.
*/
if (unlikely((flags & RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL) &&
(flags & RSEQ_CS_PREEMPT_MIGRATE_FLAGS) !=
RSEQ_CS_PREEMPT_MIGRATE_FLAGS))
return -EINVAL;
</pre>

== How CRIU handles rseq ==

CRIU handles rseq differently depending on the state of the process at dump time. Let's classify the cases and cover each of them.

# the process is not inside the rseq critical section
# the process is inside the rseq CS
## <code>flags</code> is <code>0</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>
## <code>flags</code> is <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code>

=== the process is not inside the rseq critical section ===

This is the simplest case. The process has a <code>struct rseq</code> registered in the kernel, but its instruction pointer (IP) is currently not inside a CS.

==== Dump ====
We only need to determine where the <code>struct rseq</code> is and dump its address, length, and signature.
To achieve that, we use the ptrace request <code>PTRACE_GET_RSEQ_CONFIGURATION</code> (refer to the <code>dump_thread_rseq</code> function).

==== Restore ====
We need to take the data about the <code>struct rseq</code> from the image (see images/rseq.proto) and register it from the parasite context using the <code>rseq</code> syscall (take a look at <code>restore_rseq</code> in criu/pie/restorer.c).

=== inside CS: <code>flags</code> is <code>0</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code> ===

The process was caught with its IP inside a CS. Can we act as before, that is, dump the <code>struct rseq</code> address, restore it, and so on? No, we can't.
The reason is that CRIU saves the IP as it was during the dump, whereas the rseq semantics require a jump to the abort handler if CS execution was interrupted.
In this particular case <code>flags</code> is equal to <code>0</code>, <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code>, or <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>.
This means that if the CS is interrupted by preemption or migration (<code>0</code>), by migration (<code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code>), or by preemption (<code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>),
the kernel will fix up the IP of the process to the abort handler address.

If CRIU just saved the IP as it was and restored it, the process would resume in the middle of an interrupted CS. That is a serious problem that may break the user application (or even crash it!).

Let's look at the <code>fixup_thread_rseq</code> function:
<pre>
if (task_in_rseq(rseq_cs, TI_IP(core))) {
struct pid *tid = &item->threads[i];

...

pr_warn("The %d task is in rseq critical section. IP will be set to rseq abort handler addr\n",
tid->real);

...

if (!(rseq_cs->flags & RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL)) {
pr_warn("The %d task is in rseq critical section. IP will be set to rseq abort handler addr\n",
tid->real);

TI_IP(core) = rseq_cs->abort_ip;

if (item->pid->real == tid->real) {
compel_set_leader_ip(dmpi(item)->parasite_ctl, rseq_cs->abort_ip);
} else {
compel_set_thread_ip(dmpi(item)->thread_ctls[i], rseq_cs->abort_ip);
}
}
}
</pre>

It checks whether the process IP is inside the CS and, if so, fixes it up to the abort handler IP, as the kernel does.

==== Dump ====

==== Restore ====

=== inside CS: <code>flags</code> is <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> ===

Rare case, but we support it too.

==== Dump ====

==== Restore ====

== TODO ==

* tests for all architectures (right now we have ZDTM tests only for x86_64)
* improved support for built-in rseq in non-glibc libraries
* pre-dump tests (?)
* leave-running tests (?)
* crfail test
* threaded test

== Useful links ==

* [1] https://github.com/torvalds/linux/blob/b2d229d4ddb17db541098b83524d901257e93845/kernel/rseq.c#L1
* [2] https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/
* [3] https://lwn.net/Articles/883104/

[[Category: Under the hood]]
[[Category: Editor help needed]]

Restartable Sequences

2022-04-18T08:26:48Z

Amikhalitsyn:

Restartable Sequences

2022-04-18T08:21:31Z

Amikhalitsyn: draft of the article about rseq support in CRIU

"Restartable sequences" (<code>rseq</code>) are small segments of user-space code designed to access per-CPU data structures without the need for heavyweight locking.
rseq has been supported since Linux kernel 4.18 [1].

== Linux kernel interface ==

The Linux kernel interface for rseq is fairly simple. It is just the <code>rseq</code> syscall:
<code>sys_rseq(struct rseq *rseq, uint32_t rseq_len, int flags, uint32_t sig)</code>

<pre>
enum rseq_cs_flags {
RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT),
RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE = (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
};

struct rseq_cs {
__u32 version; /* always 0 at this moment */
enum rseq_cs_flags flags;
void *start_ip;
/* Offset from start_ip. */
intptr_t post_commit_offset;
void *abort_ip;
};

struct rseq {
__u32 cpu_id_start;
__u32 cpu_id;
struct rseq_cs *rseq_cs;
enum rseq_cs_flags flags;
};
</pre>

From the userspace side, we need to keep a <code>struct rseq</code> somewhere and register it with the kernel using the <code>rseq</code> syscall.
Then, whenever we want to execute some code as an rseq critical section (<code>rseq cs</code> or just CS), we need to allocate a <code>struct rseq_cs</code> and fill it
with data. We have to specify the start address of our CS and the address of the abort handler (called when the CS is interrupted by preemption, migration,
or a signal). Then we need to store a pointer to the <code>struct rseq_cs</code> in the <code>(struct rseq).rseq_cs</code> field.
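The data flow described above (fill a descriptor, then publish a pointer to it) can be sketched with plain C structures. The <code>demo_*</code> types and functions are hypothetical and use fixed-width fields for illustration; real users typically keep the descriptor in static storage and store the pointer from inline assembly as the first instruction of the CS, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct rseq_cs and struct rseq. */
struct demo_rseq_cs {
	uint32_t version;
	uint32_t flags;
	uint64_t start_ip;
	uint64_t post_commit_offset; /* offset from start_ip */
	uint64_t abort_ip;
};

struct demo_rseq {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs; /* address of the active descriptor, 0 outside a CS */
	uint32_t flags;
};

/* Describe a CS covering [start, end) with the given abort handler. */
static void demo_describe_cs(struct demo_rseq_cs *cs, uint64_t start,
			     uint64_t end, uint64_t abort_handler)
{
	cs->version = 0;
	cs->flags = 0;
	cs->start_ip = start;
	cs->post_commit_offset = end - start;
	cs->abort_ip = abort_handler;
}

/* Publish the descriptor so that the kernel can find it. */
static void demo_enter_cs(struct demo_rseq *rs, const struct demo_rseq_cs *cs)
{
	rs->rseq_cs = (uint64_t)(uintptr_t)cs;
}
```

Note that the descriptor stores the length of the CS as an offset from <code>start_ip</code>, not as an absolute end address.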

== What about <code>flags</code>? ==

You may have noticed that both <code>struct rseq</code> and <code>struct rseq_cs</code> have a <code>flags</code> field. It may take values from <code>enum rseq_cs_flags</code>.

First of all, a user may specify flags in either place; they are combined on the kernel side:
<pre>
static int rseq_need_restart(struct task_struct *t, u32 cs_flags)
{
u32 flags, event_mask;
int ret;

/* Get thread flags. */
ret = get_user(flags, &t->rseq->flags);
if (ret)
return ret;

/* Take critical section flags into account. */
flags |= cs_flags; // <<<<<<<< here we have flags combined from struct rseq + struct rseq_cs
</pre>

The most common <code>flags</code> value is zero. In this case, the rseq CS will be interrupted (and the IP will be fixed up to the abort handler)
if preemption, migration, or a signal occurs. However, there are situations in which users may not want the section to be aborted when one of these events happens.

It's worth mentioning that <code>RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL</code> can be used only in combination with <code>RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT</code> and <code>RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE</code>:
<pre>
/*
* Restart on signal can only be inhibited when restart on
* preempt and restart on migrate are inhibited too. Otherwise,
* a preempted signal handler could fail to restart the prior
* execution context on sigreturn.
*/
if (unlikely((flags & RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL) &&
(flags & RSEQ_CS_PREEMPT_MIGRATE_FLAGS) !=
RSEQ_CS_PREEMPT_MIGRATE_FLAGS))
return -EINVAL;
</pre>

== Useful links ==

* [1] https://github.com/torvalds/linux/blob/b2d229d4ddb17db541098b83524d901257e93845/kernel/rseq.c#L1
* [2] https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/
* [3] https://lwn.net/Articles/883104/

[[Category: Under the hood]]
[[Category: Editor help needed]]

Google Summer of Code Ideas

2022-04-05T21:11:40Z

Amikhalitsyn: /* Project ideas */ added memfd_secret project

Google Summer of Code Ideas

2022-04-05T21:04:15Z

Amikhalitsyn: /* Suspended project ideas */ move io_uring

Google Summer of Code Ideas

2022-04-05T21:03:48Z

Amikhalitsyn: /* Project ideas */ remove io_uring as it's in progress by Kumar

Google Summer of Code Ideas

2022-02-21T09:09:39Z

Amikhalitsyn: /* Project ideas */

Google Summer of Code Ideas

2022-02-21T08:46:03Z

Amikhalitsyn: /* Project ideas */ added project idea about pidfd

Linux kernel

2021-04-20T08:55:00Z

Amikhalitsyn:

Most likely the first thing to enable is the <code>CONFIG_EXPERT=y</code> (General setup -> Configure standard kernel features (expert users)) option, which on x86_64 depends on the <code>CONFIG_EMBEDDED=y</code> (General setup -> Embedded system) one (welcome to Kconfig reverse chains hell).

The following options must be enabled for CRIU to work:

* ''General setup'' options
** <code>CONFIG_CHECKPOINT_RESTORE=y</code> (Checkpoint/restore support)
** <code>CONFIG_NAMESPACES=y</code> (Namespaces support)
** <code>CONFIG_UTS_NS=y</code> (Namespaces support -> UTS namespace)
** <code>CONFIG_IPC_NS=y</code> (Namespaces support -> IPC namespace)
** <code>CONFIG_SYSVIPC_SYSCTL=y</code>
** <code>CONFIG_PID_NS=y</code> (Namespaces support -> PID namespaces)
** <code>CONFIG_NET_NS=y</code> (Namespaces support -> Network namespace)
** <code>CONFIG_FHANDLE=y</code> (Open by fhandle syscalls)
** <code>CONFIG_EVENTFD=y</code> (Enable eventfd() system call)
** <code>CONFIG_EPOLL=y</code> (Enable eventpoll support)
* ''Networking support -> Networking options'' options for sock-diag subsystem
** <code>CONFIG_UNIX_DIAG=y</code> (Unix domain sockets -> UNIX: socket monitoring interface)
** <code>CONFIG_INET_DIAG=y</code> (TCP/IP networking -> INET: socket monitoring interface)
** <code>CONFIG_INET_UDP_DIAG=y</code> (TCP/IP networking -> INET: socket monitoring interface -> UDP: socket monitoring interface)
** <code>CONFIG_PACKET_DIAG=y</code> (Packet socket -> Packet: sockets monitoring interface)
** <code>CONFIG_NETLINK_DIAG=y</code> (Netlink socket -> Netlink: sockets monitoring interface)
* <code>CONFIG_NETFILTER_XT_MARK=y</code> (Networking support -> Networking options -> Network packet filtering framework (Netfilter) -> Core Netfilter Configuration -> Netfilter Xtables support (required for ip_tables) -> nfmark target and match support)
* <code>CONFIG_TUN=y</code> (Networking support -> Universal TUN/TAP device driver support)

Other options are not required by CRIU itself, but C/R of the corresponding features is supported ([[ZDTM test suite]] may fail without them):
* <code>CONFIG_INOTIFY_USER=y</code> (File systems -> Inotify support for userspace)
* <code>CONFIG_FANOTIFY=y</code> (File systems -> Filesystem wide access notification)
* <code>CONFIG_MEMCG=y</code> (General setup -> Control Group support -> Memory controller)
* <code>CONFIG_CGROUP_DEVICE=y</code> (General setup -> Control Group support -> Device controller)
* <code>CONFIG_MACVLAN=y</code> (Device Drivers -> Network device support -> Network core driver support -> MAC-VLAN support)
* <code>CONFIG_BRIDGE=y</code> (Networking support -> Networking options -> 802.1d Ethernet Bridging)
* <code>CONFIG_BINFMT_MISC=y</code> (Userspace binary formats -> Kernel support for MISC binaries)
* <code>CONFIG_IA32_EMULATION=y</code> (x86 only) (Executable file formats -> Emulations -> IA32 Emulation)

For some [[usage scenarios]] there is an ability to track memory changes and produce [[incremental dumps]]. This requires the <code>CONFIG_MEM_SOFT_DIRTY=y</code> option (optional) (Processor type and features -> Track memory changes). In order to enable [[lazy migration]], the [[userfaultfd]] system call is required: <code>CONFIG_USERFAULTFD=y</code> (optional) (General setup -> Enable userfaultfd() system call).

At the beginning of the project we had our [[custom kernel]], which contained some experimental CRIU-related patches. Nowadays it is hardly used.

[[Category: Building]]

Linux kernel

2021-04-20T08:54:43Z

Amikhalitsyn: added CONFIG_SYSVIPC_SYSCTL

Most likely the first thing to enable is the <code>CONFIG_EXPERT=y</code> (General setup -> Configure standard kernel features (expert users)) option, which on x86_64 depends on the <code>CONFIG_EMBEDDED=y</code> (General setup -> Embedded system) one (welcome to Kconfig reverse chains hell).

The following options must be enabled for CRIU to work:

* ''General setup'' options
** <code>CONFIG_CHECKPOINT_RESTORE=y</code> (Checkpoint/restore support)
** <code>CONFIG_NAMESPACES=y</code> (Namespaces support)
** <code>CONFIG_UTS_NS=y</code> (Namespaces support -> UTS namespace)
** <code>CONFIG_IPC_NS=y</code> (Namespaces support -> IPC namespace)
** <code>CONFIG_SYSVIPC_SYSCTL=y</code>
** <code>CONFIG_PID_NS=y</code> (Namespaces support -> PID namespaces)
** <code>CONFIG_NET_NS=y</code> (Namespaces support -> Network namespace)
** <code>CONFIG_FHANDLE=y</code> (Open by fhandle syscalls)
** <code>CONFIG_EVENTFD=y</code> (Enable eventfd() system call)
** <code>CONFIG_EPOLL=y</code> (Enable eventpoll support)
* ''Networking support -> Networking options'' options for sock-diag subsystem
** <code>CONFIG_UNIX_DIAG=y</code> (Unix domain sockets -> UNIX: socket monitoring interface)
** <code>CONFIG_INET_DIAG=y</code> (TCP/IP networking -> INET: socket monitoring interface)
** <code>CONFIG_INET_UDP_DIAG=y</code> (TCP/IP networking -> INET: socket monitoring interface -> UDP: socket monitoring interface)
** <code>CONFIG_PACKET_DIAG=y</code> (Packet socket -> Packet: sockets monitoring interface)
** <code>CONFIG_NETLINK_DIAG=y</code> (Netlink socket -> Netlink: sockets monitoring interface)
* <code>CONFIG_NETFILTER_XT_MARK=y</code> (Networking support -> Networking options -> Network packet filtering framework (Netfilter) -> Core Netfilter Configuration -> Netfilter Xtables support (required for ip_tables) -> nfmark target and match support)
* <code>CONFIG_TUN=y</code> (Networking support -> Universal TUN/TAP device driver support)

Other options are not required by CRIU itself, but C/R of the corresponding features is supported ([[ZDTM test suite]] may fail without them):
* <code>CONFIG_INOTIFY_USER=y</code> (File systems -> Inotify support for userspace)
* <code>CONFIG_FANOTIFY=y</code> (File systems -> Filesystem wide access notification)
* <code>CONFIG_MEMCG=y</code> (General setup -> Control Group support -> Memory controller)
* <code>CONFIG_CGROUP_DEVICE=y</code> (General setup -> Control Group support -> Device controller)
* <code>CONFIG_MACVLAN=y</code> (Device Drivers -> Network device support -> Network core driver support -> MAC-VLAN support)
* <code>CONFIG_BRIDGE=y</code> (Networking support -> Networking options -> 802.1d Ethernet Bridging)
* <code>CONFIG_BINFMT_MISC=y</code> (Userspace binary formats -> Kernel support for MISC binaries)
* <code>CONFIG_IA32_EMULATION=y</code> (x86 only) (Executable file formats -> Emulations -> IA32 Emulation)

For some [[usage scenarios]] there is an ability to track memory changes and produce [[incremental dumps]]. This requires the <code>CONFIG_MEM_SOFT_DIRTY=y</code> option (optional) (Processor type and features -> Track memory changes). In order to enable [[lazy migration]], the [[userfaultfd]] system call is required: <code>CONFIG_USERFAULTFD=y</code> (optional) (General setup -> Enable userfaultfd() system call).

At the beginning of the project we had our [[custom kernel]], which contained some experimental CRIU-related patches. Nowadays it is hardly used.

[[Category: Building]]

Installation

2020-10-07T09:07:47Z

Amikhalitsyn: /* Other stuff */ added info about nftables dependency

<code>criu</code> is a utility to checkpoint/restore a process tree. This page describes how to get the CRIU binary on your box.

== Installing from packages ==

Many distributions provide ready-to-use [[packages]]. If yours does not, or the CRIU version you want is not yet there, you will need to get the CRIU sources and compile them.

== Obtaining CRIU sources ==

You can download the source code as a [https://download.openvz.org/criu/ release tarball] or sync the [https://github.com/checkpoint-restore/criu git repository]. If you plan to modify CRIU sources (e.g. to [[How to submit patches|contribute the code back]]) the latter way is highly recommended. The latest and greatest sources are: {{Latest release}}

== Installing build dependencies ==

=== Compiler and C Library ===

CRIU is mostly written in C and the build system is based on Makefiles. Thus just install standard <code>gcc</code> and <code>make</code> packages (on Debian use <code>[https://packages.debian.org/build-essential build-essential]</code>).

For building with [[32bit tasks C/R]] support you will need <code>libc6-dev-i386, gcc-multilib</code> instead of <code>gcc</code>.

[[ARM crosscompile|Cross-compilation for ARM]] is also possible.

=== Protocol Buffers ===

CRIU uses the [https://developers.google.com/protocol-buffers/ Google Protocol Buffers] to read and write [[images]]. The <code>protoc</code> tool is used at build time and CRIU is linked with <code>libprotobuf-c.so</code>. Also, [[CRIT]] uses the Python bindings and the <code>descriptor.proto</code> file, which is typically provided by a distribution's protobuf development package.

; RPM packages
: <code>protobuf protobuf-c protobuf-c-devel protobuf-compiler protobuf-devel protobuf-python</code>

; Deb packages
: <code>libprotobuf-dev libprotobuf-c0-dev protobuf-c-compiler protobuf-compiler python-protobuf</code>

Optionally, you may [[build protobuf]] from sources.

=== Other stuff ===

* <code>pkg-config</code> to check on build library dependencies.
* <code>python-ipaddress</code> is used by CRIT to pretty-print IP addresses and is also required by zdtm.py
* <code>libbsd-devel</code> (RPM) / <code>libbsd-dev</code> (DEB) If available, CRIU will be compiled with <code>setproctitle()</code> support and set verbose process titles on service workers.
* <code>iproute2</code> version 3.5.0 or higher is needed for dumping network namespaces. The latest one can be cloned from [http://git.kernel.org/?p=linux/kernel/git/shemminger/iproute2.git;a=summary iproute2]. It should be compiled and a path to ip set as the [[environment variables|<code>CR_IP_TOOL</code> variable]]
* <code>nftables</code> (RPM) / <code>libnftables-dev</code> (DEB) If available, CRIU will be compiled with nftables C/R support
* <code>libcap-devel</code> (RPM) / <code>libcap-dev</code> (DEB)
* <code>libnet-devel libnl3-devel</code> (RPM) / <code>libnet1-dev</code> (DEB) / <code>libnl-3-dev libnet-dev</code> (Ubuntu)
* <code>libaio-devel</code> (RPM) / <code>libaio-dev</code> (DEB) is needed to run tests
* <code>python2-future</code> or <code>python3-future</code> is now needed for zdtm.py tests launcher

With APT, use the <code>--no-install-recommends</code> parameter to avoid asciidoc pulling in a lot of dependencies.
Also read about the [[ZDTM test suite]] if you plan to run CRIU tests, as those sources need other dependencies.

== Building the tool ==

Simply run <code>make</code> in the CRIU source directory. This is the standard way, but there are some options available.

# There's a ''docker-build'' target in Makefile which builds CRIU in Ubuntu Docker container. Just run <code>make docker-build</code> and that's it.
# CRIU has functionality that is either optional or behaves differently depending on the kernel CRIU is running on. By default build process includes maximum of it, but this behavior [[configuring|can be changed]].
# You may [[Manual build deps|specify build dependencies by hands]]

== Installing ==

CRIU works perfectly even when run from the sources directory (with the <code>./criu/criu</code> command), but if you want to have it in standard paths, run <code>make install</code>. You may need to install the <code>asciidoc</code> and <code>xmlto</code> packages to make install-man work.

== Checking That It Works ==

Linux kernel v3.11 or newer is required, with some specific config options turned on. Various advanced CRIU features might require an even newer kernel. So the first thing to do is to [[Checking the kernel|check the kernel]] by running <code>criu check</code>. At the end it should say "Looks OK"; if it doesn't, the messages on the screen explain what functionality is missing. If your distribution does not provide the needed kernel, you might want to [[Linux kernel|compile one yourself]].

You can then try running the [[ZDTM Test Suite]] which sits in the <code>tests/zdtm/</code> directory.

== Further reading ==

* [[Usage]]
* [[Advanced usage]]
* [[:Category:HOWTO]]

[[Category:HOWTO]]
[[Category:Editor help needed]]

News/events

2020-08-25T17:05:36Z

Amikhalitsyn: /* Linux Plumbers Conference 2020 */

<noinclude> __NOTOC__


This page collects info about events CRIU takes part in.

<startFeed/></noinclude>
== Linux Plumbers Conference 2020 ==
[[Image:Linuxplumbers.png|left|100px|link=]]

'''August 24-28, online at https://meet.2020.linuxplumbersconf.org/'''

[https://linuxplumbersconf.org/event/7/contributions/641/ Fast checkpointing with criu-image-streamer]
[https://youtu.be/fSyr_IXM21Y?t=7762 YouTube]

[https://linuxplumbersconf.org/event/7/contributions/642/ FastFreeze: Unprivileged checkpoint/restore for containerized applications]
[https://youtu.be/fSyr_IXM21Y?t=2654 YouTube]

[https://linuxplumbersconf.org/event/7/contributions/640/ CRIU mounts migration: problems and solutions]
[https://youtu.be/fSyr_IXM21Y?t=1593 YouTube]

[https://linuxplumbersconf.org/event/7/contributions/643/ Checkpoint-restoring containers with Docker inside]
[https://youtu.be/fSyr_IXM21Y?t=6043 YouTube]
<br clear="both">

== Phoronix news ==
[[Image:phoronix.png|left|100px|link=]]

'''August 4, 2020, online'''

[https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.9-Checkpoint-Restore Checkpoint/Restore Of Unprivileged Processes Sent In For Linux 5.9]
<br clear="both">

== FOSDEM 2020 ==
[[Image:Fosdem.png|left|100px|link=]]

'''February 1, 2020, Brussels, Belgium'''

[https://fosdem.org/2020/schedule/event/containers_live_migration/ Container Live Migration]

--[[User:Rppt]] 19:22, 28 February 2020‎ (UTC)
<br clear="both">

== Linux Plumbers Conference 2019 ==
[[Image:Linuxplumbers.png|left|100px|link=]]

'''September 9-11, Lisbon, Portugal'''

[https://linuxplumbersconf.org/event/4/contributions/508/ Update on Task Migration at Google Using CRIU]

[https://linuxplumbersconf.org/event/4/contributions/472/ CRIU and the PID dance]

[https://linuxplumbersconf.org/event/4/contributions/513/ CRIU: Reworking vDSO proxification, syscall restart]

[https://linuxplumbersconf.org/event/4/contributions/478/ Secure Image-less Container Migration]

--[[User:Avagin]] 14:05, 23 Aug 2019 (UTC)
<br clear="both">

== Google Summer of Code 2019 ==
[[Image:gsoc.png|left|100px|link=]]

'''Mar-Sep 2019'''

[https://summerofcode.withgoogle.com/organizations/6333591376625664/ Google Summer of Code]

--[[User:Avagin]] 21:32, 26 Feb 2019 (PST)
<br clear="both">

<noinclude><endFeed/>

== See also ==
* [[News/events/past|Past events]]

</noinclude>

News/events

2020-08-25T16:59:02Z

Amikhalitsyn: added information about LPC 2020

<noinclude> __NOTOC__


This page collects info about events CRIU takes part in.

<startFeed/></noinclude>
== Linux Plumbers Conference 2020 ==
[[Image:Linuxplumbers.png|left|100px|link=]]

'''August 24-28, online at https://meet.2020.linuxplumbersconf.org/'''

[https://linuxplumbersconf.org/event/7/contributions/641/ Fast checkpointing with criu-image-streamer]

[https://linuxplumbersconf.org/event/7/contributions/642/ FastFreeze: Unprivileged checkpoint/restore for containerized applications]

[https://linuxplumbersconf.org/event/7/contributions/640/ CRIU mounts migration: problems and solutions]

[https://linuxplumbersconf.org/event/7/contributions/643/ Checkpoint-restoring containers with Docker inside]
<br clear="both">

== Phoronix news ==
[[Image:phoronix.png|left|100px|link=]]

'''August 4, 2020, online'''

[https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.9-Checkpoint-Restore Checkpoint/Restore Of Unprivileged Processes Sent In For Linux 5.9]
<br clear="both">

== FOSDEM 2020 ==
[[Image:Fosdem.png|left|100px|link=]]

'''February 1, 2020, Brussels, Belgium'''

[https://fosdem.org/2020/schedule/event/containers_live_migration/ Container Live Migration]

--[[User:Rppt]] 19:22, 28 February 2020‎ (UTC)
<br clear="both">

== Linux Plumbers Conference 2019 ==
[[Image:Linuxplumbers.png|left|100px|link=]]

'''September 9-11, Lisbon, Portugal'''

[https://linuxplumbersconf.org/event/4/contributions/508/ Update on Task Migration at Google Using CRIU]

[https://linuxplumbersconf.org/event/4/contributions/472/ CRIU and the PID dance]

[https://linuxplumbersconf.org/event/4/contributions/513/ CRIU: Reworking vDSO proxification, syscall restart]

[https://linuxplumbersconf.org/event/4/contributions/478/ Secure Image-less Container Migration]

--[[User:Avagin]] 14:05, 23 Aug 2019 (UTC)
<br clear="both">

== Google Summer of Code 2019 ==
[[Image:gsoc.png|left|100px|link=]]

'''Mar-Sep 2019'''

[https://summerofcode.withgoogle.com/organizations/6333591376625664/ Google Summer of Code]

--[[User:Avagin]] 21:32, 26 Feb 2019 (PST)
<br clear="both">

<noinclude><endFeed/>

== See also ==
* [[News/events/past|Past events]]

</noinclude>

Google Summer of Code Ideas

2020-03-04T12:16:29Z

Amikhalitsyn: /* Project ideas */

Google Summer of Code Ideas

2020-02-28T07:49:18Z

Amikhalitsyn: /* Add support for SPFS */

Contacts

2020-02-25T13:00:48Z

Amikhalitsyn: added info about GitHub organization and Gitter

There are many ways to contact the CRIU community. This page lists the official social network accounts and other points of contact.

* [https://github.com/checkpoint-restore GitHub checkpoint-restore project]
* [https://gitter.im/save-restore/CRIU Gitter]
* [https://twitter.com/__criu__ CRIU Twitter]
* [https://www.youtube.com/channel/UCeXb0oWYd7ZE-44TrTSWxmg YouTube channel]
* [https://lists.openvz.org/mailman/listinfo/criu Mailing list]
* IRC channels on Freenode:
** [https://webchat.freenode.net/?channels=#criu #criu] - developer discussions. Logs are [https://botbot.me/freenode/criu/ available].
** [https://webchat.freenode.net/?channels=#criu-commit-bot #criu-commit-bot] - commits to the CRIU source code repository
** [https://webchat.freenode.net/?channels=#criu-ci #criu-ci] - status of CI jobs from Jenkins

== See also ==

* [https://openvz.org/Contacts OpenVZ contacts]

[[Category: Communication]]

GSoC Students Recommendations

2020-02-25T12:58:54Z

Amikhalitsyn: /* Contacts */

[[Category:GSoC]]

== Contacts ==

The entry points to the community are the [https://github.com/checkpoint-restore GitHub checkpoint-restore project], [https://gitter.im/save-restore/CRIU Gitter] and the <code>criu@openvz.org</code> mailing list. The [[Google Summer of Code Ideas|ideas]] page also lists the mentors' personal e-mail addresses for each sub-project.

== Takeoff ==

Getting started with CRIU is as simple as one, two, three:

# Get the sources from [https://github.com/checkpoint-restore/criu]
# Build them with <code>make</code>
# Do your first C/R by running a simple test with <code>zdtm.py run -t static/env00</code>
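
The three steps above can be sketched as a small Python helper that lists the commands and, optionally, runs them. This is only an illustration under stated assumptions: the sources are fetched with <code>git clone</code>, <code>zdtm.py</code> is started from the <code>test</code> subdirectory of the tree, and the test name is copied verbatim from step 3 (your tree may expect a different path prefix). The function names <code>takeoff_steps</code> and <code>run_takeoff</code> are hypothetical and not part of CRIU. Running the test usually requires root privileges.

```python
import subprocess
from pathlib import Path

# Repository URL taken from step 1 above.
REPO_URL = "https://github.com/checkpoint-restore/criu"


def takeoff_steps(workdir):
    """Return (argv, cwd) pairs for the three takeoff steps.

    Assumptions (not stated on this page): sources are cloned with git,
    and zdtm.py is run from the `test` subdirectory of the source tree.
    """
    workdir = Path(workdir)
    src = workdir / "criu"
    return [
        (["git", "clone", REPO_URL, str(src)], workdir),  # 1. get the sources
        (["make"], src),                                  # 2. build them
        # 3. first C/R; test name copied from the step above
        (["./zdtm.py", "run", "-t", "static/env00"], src / "test"),
    ]


def run_takeoff(workdir, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for argv, cwd in takeoff_steps(workdir):
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            # check=True stops at the first failing step
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    run_takeoff(".", dry_run=True)
```

By default the helper only prints the commands, so they can be reviewed before anything is cloned or built; pass <code>dry_run=False</code> to execute them.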

Here are some links for further reading:

* [[Installation]]
* [[CLI]]
* [[Simple loop]]
* [[ZDTM test suite]]

== Contributing ==

When a new patch is ready, it can be submitted for merging either [[How_to_submit_patches|via the CRIU mailing list]] (recommended) or via a GitHub pull request.