Google Summer of Code Ideas
Google Summer of Code (GSoC) is a global program that offers post-secondary students an opportunity to be paid for contributing to an open source project over a three-month period.
This page contains project ideas for the upcoming Google Summer of Code.
Contacts
Please contact the respective mentor for the idea you are interested in. For general questions, feel free to send an email to the mailing list or write on Gitter.
Project ideas
Support sparse ghosts
When criu dumps processes, it also dumps the files they have open, normally by saving the names under which those files are accessible. Sometimes a file has no name, for example when a task opened it and then removed it. Because there is no name to save, criu saves the whole file instead; this is called a "ghost file". Saving the whole file is expensive (it copies a lot of data on disk), so criu limits the maximum size of a ghost file. That limit works poorly for "sparse" files, which are large in length but may use little actual disk space. The goal of this task is to support sparse ghost files: limit the size of a ghost by its disk usage rather than its length, and when copying the data, detect the used blocks and save only those.
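As a rough illustration of the two pieces described above, the sketch below measures disk usage through st_blocks and copies only the data extents found with lseek(SEEK_DATA/SEEK_HOLE). The helper names ghost_disk_usage and copy_sparse are hypothetical and are not taken from the criu sources; on filesystems without hole reporting the kernel treats the whole file as data, so the copy stays correct but is no longer sparse.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Bytes actually allocated on disk; st_blocks counts 512-byte units. */
static long long ghost_disk_usage(int fd)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	return (long long)st.st_blocks * 512;
}

/*
 * Hypothetical helper: copy only the data extents of src into dst,
 * leaving holes in dst wherever src has holes. Returns 0 on success.
 */
static int copy_sparse(int src, int dst)
{
	struct stat st;
	off_t data = 0, hole;
	char buf[4096];

	if (fstat(src, &st) < 0)
		return -1;

	while (data < st.st_size) {
		/* Find the next region that holds data. */
		data = lseek(src, data, SEEK_DATA);
		if (data < 0) {
			if (errno == ENXIO)
				break; /* only a trailing hole is left */
			return -1;
		}
		/* Find where that region ends. */
		hole = lseek(src, data, SEEK_HOLE);
		if (hole < 0)
			return -1;

		while (data < hole) {
			size_t want = sizeof(buf);
			ssize_t n;

			if ((off_t)want > hole - data)
				want = (size_t)(hole - data);
			n = pread(src, buf, want, data);
			if (n <= 0)
				return -1;
			if (pwrite(dst, buf, (size_t)n, data) != n)
				return -1;
			data += n;
		}
	}

	/* Preserve the logical length, including any trailing hole. */
	return ftruncate(dst, st.st_size);
}
```

A size limit based on ghost_disk_usage() rather than st_size would let large but mostly empty files through while still rejecting files that really occupy a lot of disk.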
Links:
Details:
- Skill level: intermediate
- Language: C
- Expected size: 350 hours
- Mentor: Pavel Emelyanov <ovzxemul@gmail.com>
- Mentor: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
- Suggested by: Pavel Emelyanov <ovzxemul@gmail.com>
Optimize logging engine
Summary: CRIU writes a lot of log messages while doing its job. Logging is done with a plain fprintf call. The messages are typically useless, but if some operation fails, the logs are the only way to find the reason for the failure.
At the same time, the printf family of functions is known to be slow: they have to scan the format string for % conversions and then convert the arguments into strings. Comparing criu dump with and without logs shows a notable time difference (15%-20%), so speeding up logging will help improve criu performance.
One solution to the problem might be binary logging. The difficulty with binary logs is the amount of effort needed to convert existing logs to binary form. Preferably, the switch to binary logging either keeps existing log() calls intact or provides some automation to convert them.
The option to keep log() calls intact might be a pre-compilation pass over the sources. In this pass each log(fmt, ...) call gets translated into a call to a binary log function that saves the fmt identifier and copies all the args as is into the log file. The binary log decode utility, required in this case, should then find the fmt string by its ID in the log file and print the resulting message.
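To make the idea concrete, here is a minimal sketch of such a record format and its decoder. Everything in it is an assumption made for illustration: the blog_* names, the fixed limit of four arguments, the restriction to long arguments, and the hand-written format table that a real pre-compilation pass would generate.

```c
#include <stdio.h>
#include <string.h>

#define BLOG_MAX_ARGS 4

/* Hypothetical table; a pre-compilation pass would generate it. */
static const char *const blog_fmts[] = {
	"Dumping task %ld\n",
	"Mapped %ld pages at offset %ld\n",
};

/* One binary record: the format ID plus the raw argument values. */
struct blog_rec {
	unsigned int fmt_id;
	unsigned int nargs;
	long args[BLOG_MAX_ARGS];
};

/* Writer side: store the ID and the args, with no string formatting. */
static int blog_write(FILE *f, unsigned int fmt_id, unsigned int nargs,
		      const long *args)
{
	struct blog_rec r = { .fmt_id = fmt_id, .nargs = nargs };

	if (nargs > BLOG_MAX_ARGS)
		return -1;
	memcpy(r.args, args, nargs * sizeof(long));
	return fwrite(&r, sizeof(r), 1, f) == 1 ? 0 : -1;
}

/*
 * Decoder side: read one record, look up its format string and print.
 * Returns 1 on success, 0 at end of file, -1 on a malformed record.
 */
static int blog_decode(FILE *f, char *out, size_t len)
{
	struct blog_rec r;

	if (fread(&r, sizeof(r), 1, f) != 1)
		return 0;
	if (r.fmt_id >= sizeof(blog_fmts) / sizeof(blog_fmts[0]) ||
	    r.nargs > BLOG_MAX_ARGS)
		return -1;
	/* Unused trailing arguments are ignored by the format string. */
	snprintf(out, len, blog_fmts[r.fmt_id],
		 r.args[0], r.args[1], r.args[2], r.args[3]);
	return 1;
}
```

The expensive part (parsing the format and converting numbers to text) moves from the dump path into the decoder, which only runs when somebody actually needs to read the log.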
Links:
Details:
- Skill level: intermediate
- Language: C, though decoder/preprocessor can be in any language
- Expected size: 350 hours
- Mentor: Pavel Emelyanov <ovzxemul@gmail.com>
- Suggested by: Andrei Vagin <avagin@gmail.com>
Add support for checkpoint/restore of CORK-ed UDP socket
Summary: Support C/R of corked UDP socket
There is a UDP_CORK option for sockets. As the man page says:
If this option is enabled, then all data output on this socket is accumulated into a single datagram that is transmitted when the option is disabled. This option should not be used in code intended to be portable.
Currently criu refuses to dump this case, so it's effectively a bug. Supporting it will require extending the kernel API to allow criu to read back the write queue of the socket (see how it's done for TCP sockets, for example). The queue is then written into the image and restored into the socket (with the CORK bit set too).
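For readers unfamiliar with the option, the sketch below shows the state in question from the application side: two send() calls on a corked socket are held in the write queue and leave as one datagram when the cork is removed. The function names and the loopback setup are illustrative assumptions, and the sketch does not touch the kernel interface this project would have to add.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Set or clear UDP_CORK on a UDP socket. */
static int udp_set_cork(int sk, int on)
{
	return setsockopt(sk, IPPROTO_UDP, UDP_CORK, &on, sizeof(on));
}

/*
 * Send "foo" and "bar" over a corked loopback socket and receive the
 * result. Returns the length of the received datagram, or -1 on error.
 */
static ssize_t corked_send_demo(char *out, size_t len)
{
	struct timeval tmo = { .tv_sec = 2 };
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	ssize_t n = -1;
	int rx, tx;

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;

	/* Do not block forever if the datagram never arrives. */
	setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tmo, sizeof(tmo));

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(rx, (struct sockaddr *)&addr, &alen) < 0 ||
	    connect(tx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	if (udp_set_cork(tx, 1) < 0)
		goto out;
	if (send(tx, "foo", 3, 0) != 3 || send(tx, "bar", 3, 0) != 3)
		goto out;
	/*
	 * "foobar" now sits in the write queue of tx. This is the data
	 * criu would have to read back and restore.
	 */
	if (udp_set_cork(tx, 0) < 0)
		goto out;
	n = recv(rx, out, len, 0);
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return n;
}
```

A dump taken between the second send() and the uncorking would have to capture both the pending bytes and the fact that the cork is set.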
Links:
Details:
- Skill level: intermediate (+linux kernel)
- Language: C
- Expected size: 350 hours
- Mentor: Pavel Emelianov <ovzxemul@gmail.com>
- Suggested by: Pavel Emelianov <ovzxemul@gmail.com>
Add support for pidfd file descriptors
Summary: Support C/R of pidfd descriptors
There is a pidfd_open syscall that opens a special PID file descriptor. Through it a user can send a signal to the process (pidfd_send_signal syscall) or wait for the process (poll() on the pidfd).
At the moment CRIU can't dump processes that have pidfds open.
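The calls mentioned above can be exercised with a short sketch like the following. It invokes the syscalls through syscall(2) so that it does not depend on a particular libc version; the *_raw wrapper names and pidfd_demo are hypothetical, and the sketch assumes kernel headers that define SYS_pidfd_open and SYS_pidfd_send_signal.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thin wrappers around the raw syscalls (names are illustrative). */
static int pidfd_open_raw(pid_t pid, unsigned int flags)
{
	return (int)syscall(SYS_pidfd_open, pid, flags);
}

static int pidfd_send_signal_raw(int pidfd, int sig)
{
	return (int)syscall(SYS_pidfd_send_signal, pidfd, sig, NULL, 0);
}

/*
 * Fork a child, signal it through a pidfd and wait for it with poll().
 * Returns the signal that terminated the child, or -1 on failure.
 */
static int pidfd_demo(void)
{
	struct pollfd pfd;
	int pidfd, status, ret = -1;
	pid_t pid;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		pause();
		_exit(0);
	}

	pidfd = pidfd_open_raw(pid, 0);
	if (pidfd < 0) {
		/* Syscall unavailable or filtered: clean up the child. */
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		return -1;
	}

	pfd.fd = pidfd;
	pfd.events = POLLIN;
	if (pidfd_send_signal_raw(pidfd, SIGTERM) < 0)
		kill(pid, SIGKILL);
	else
		poll(&pfd, 1, -1); /* readable once the child has exited */

	close(pidfd);
	if (waitpid(pid, &status, 0) == pid && WIFSIGNALED(status))
		ret = WTERMSIG(status);
	return ret;
}
```

For checkpoint/restore the interesting part is that the descriptor refers to a specific process, so a restored pidfd has to point at the corresponding restored task.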
Links:
- https://lwn.net/Articles/801319/
- https://lwn.net/Articles/794707/
- https://github.com/torvalds/linux/blob/v5.16/kernel/fork.c#L1877
Details:
- Skill level: intermediate
- Language: C
- Expected size: 350 hours
- Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Christian Brauner <christian@brauner.io>
- Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Add support for memfd_secret file descriptors
Summary: Support C/R of memfd_secret descriptors
There is a memfd_secret syscall that lets a user open a special memfd backed by a memory range that is inaccessible to other processes (and to the kernel too!).
At the moment CRIU can't dump processes that have memfd_secret descriptors open.
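A minimal usage sketch of the syscall is shown below, assuming kernel headers that define SYS_memfd_secret and a kernel where secret memory is enabled; on other systems the call simply fails and the sketch returns -1. The names memfd_secret_raw and secret_roundtrip are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Raw wrapper; reports ENOSYS if the headers lack the syscall number. */
static int memfd_secret_raw(unsigned int flags)
{
#ifdef SYS_memfd_secret
	return (int)syscall(SYS_memfd_secret, flags);
#else
	(void)flags;
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Create a secret memory area of one page, store a byte in it and read
 * the byte back. Returns the byte, or -1 if secret memory is unavailable.
 */
static int secret_roundtrip(unsigned char value)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int fd, ret = -1;

	fd = memfd_secret_raw(0);
	if (fd < 0)
		return -1;
	/* The descriptor starts empty and must be sized before mapping. */
	if (ftruncate(fd, (off_t)len) < 0)
		goto out;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out;
	p[0] = value;
	ret = p[0];
	munmap(p, len);
out:
	close(fd);
	return ret;
}
```

The mapping is only readable through the owning process's own address space, which is why dumping it needs an approach that differs from ordinary memfd handling.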
Links:
Details:
- Skill level: intermediate
- Language: C
- Expected size: 350 hours
- Mentors: Alexander Mikhalitsyn <alexander@mihalicyn.com>, Mike Rapoport <mike.rapoport@gmail.com>
- Suggested by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Use eBPF to lock and unlock the network
Summary: Use eBPF instead of external iptables-restore tool for network lock and unlock.
During checkpointing and restoring CRIU locks the network to make sure no network packets are accepted by the network stack during the time the process is checkpointed. Currently CRIU calls out to iptables-restore to create and delete the corresponding iptables rules. Another approach which avoids calling out to the external binary iptables-restore would be to directly inject eBPF rules. There have been reports from users that iptables-restore fails in some way and eBPF could avoid this external dependency.
Links:
- https://www.criu.org/TCP_connection#Checkpoint_and_restore_TCP_connection
- https://github.com/systemd/systemd/blob/master/src/core/bpf-firewall.c
- https://blog.zeyady.com/2021-08-16/gsoc-criu
Details:
- Skill level: intermediate
- Language: C
- Expected size: 350 hours
- Mentor: Radostin Stoyanov <rstoyanov@fedoraproject.org>
- Suggested by: Adrian Reber <areber@redhat.com>
CGroup-v2 support
Summary: cgroup is a mechanism to organize processes hierarchically and distribute system resources along the hierarchy in a controlled and configurable manner. cgroup v2 is a new version of the cgroup file system. Unlike v1, cgroup v2 has only a single hierarchy. CRIU has to dump/restore a container's cgroup hierarchy along with all per-cgroup options. The cgroup v2 support in CRIU has to be compatible with Docker, containerd and cri-o.
Links:
- CGroups
- https://github.com/checkpoint-restore/criu/issues/252
- https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html
Details:
- Skill level: intermediate
- Language: C
- Expected size: 350 hours
- Mentor: Andrei Vagin <avagin@gmail.com>
- Suggested by: Andrei Vagin <avagin@gmail.com>
Dump shmem in user-mode (unprivileged-mode)
CRIU uses /proc/pid/map_files to dump and restore anonymous shared memory regions, but map_files is restricted to the global CAP_SYS_ADMIN capability. In most cases it is possible to dump/restore shared memory regions without map_files, and we need to implement this in CRIU.
Links:
Details:
- Skill level: intermediate
- Language: C
- Expected size: 350 hours
- Suggested by: Andrei Vagin <avagin@gmail.com>
- Suggested by: Pavel Emelyanov <ovzxemul@gmail.com>
- Mentor: Pavel Emelyanov <ovzxemul@gmail.com>
Files on detached mounts
Summary: Initial support of open files on "detached" mounts
When criu dumps a process with an open fd on a file, it gets the mount identifier (mnt_id) via /proc/<pid>/fdinfo/<fd>, so that criu knows from which exact mount the file was initially opened. This way criu can restore this fd by opening the same exact file from topologically the same mount in the restored mount tree.
Restoring the fd from the right mount can be important in different cases. For instance, the process may later want to resolve paths relative to the fd, and resolving from the same file on a different mount can lead to different resolved paths. The process may also want to check the path to the file via /proc/<pid>/fd/<fd>.
But we have a problem finding the mount on which to reopen the file at restore if we only know the mnt_id but can't find this mnt_id in /proc/<pid>/mountinfo.
The mountinfo file shows the mount tree topology of the current mntns: parent-child relations, sharing group information, mountpoint and fs root information. If we don't see the mnt_id in it, we don't know anything about this mount.
This can happen in two cases:
- 1) external mount or file: if the file was opened from e.g. the host, its mount would not be visible in the container's mountinfo
- 2) the mount was lazily unmounted
In case of 1) we have criu options to help criu handle external dependencies.
In case of 2), or if no options are provided, criu can't resolve the mnt_id in mountinfo and fails.
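The lookup that fails in these cases can be sketched as follows, for the current process only. The helpers fd_mnt_id and mnt_id_visible are hypothetical names, not criu functions; the sketch reads the mnt_id field from fdinfo and then checks whether any mountinfo line starts with that ID.

```c
#include <stdio.h>
#include <string.h>

/* Read the "mnt_id:" field of /proc/self/fdinfo/<fd>; -1 on failure. */
static int fd_mnt_id(int fd)
{
	char path[64], line[256];
	int mnt_id = -1;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
	f = fopen(path, "r");
	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		if (sscanf(line, "mnt_id: %d", &mnt_id) == 1)
			break;
	fclose(f);
	return mnt_id;
}

/*
 * Check whether mnt_id is listed in /proc/self/mountinfo, where the
 * mount ID is the first field of every line.
 * Returns 1 if found, 0 if not found, -1 if mountinfo cannot be read.
 */
static int mnt_id_visible(int mnt_id)
{
	char line[4096];
	int id, found = 0, at_line_start = 1;
	FILE *f;

	f = fopen("/proc/self/mountinfo", "r");
	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (at_line_start && sscanf(line, "%d", &id) == 1 &&
		    id == mnt_id) {
			found = 1;
			break;
		}
		/* A chunk without '\n' means the line continues. */
		at_line_start = strchr(line, '\n') != NULL;
	}
	fclose(f);
	return found;
}
```

For a file on a lazily unmounted mount, fd_mnt_id() still returns a valid ID while mnt_id_visible() returns 0, which is exactly the situation described above.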
Solution: We can handle 2) by resolving major/minor via fstat and using name_to_handle_at and open_by_handle_at to open the same file on any other available mount from the same superblock (same major/minor) in the container. This gives us fd2, which refers to the same file as fd but on an existing mount, so we can dump it as usual and mark it as "detached" in the image. On restore criu then knows where to find the file, but instead of just opening fd2 from the actually restored mount, it creates a temporary bind mount that is lazily unmounted right after the open, making the file appear as a file on a detached mount.
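The file handle part of this solution can be sketched roughly as below. The function names are hypothetical, mount_fd is assumed to be any descriptor on another mount of the same filesystem, and open_by_handle_at needs CAP_DAC_READ_SEARCH, so the sketch fails with -1 when run unprivileged.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if both descriptors live on the same superblock (same
 * st_dev, i.e. the same major/minor), 0 if not, -1 on error.
 */
static int same_superblock(int fd_a, int fd_b)
{
	struct stat a, b;

	if (fstat(fd_a, &a) < 0 || fstat(fd_b, &b) < 0)
		return -1;
	return a.st_dev == b.st_dev;
}

/*
 * Hypothetical helper: reopen the file behind fd through mount_fd.
 * Returns the new descriptor (fd2), or -1 on failure.
 */
static int reopen_on_mount(int fd, int mount_fd, int flags)
{
	struct file_handle *fh;
	int mnt_id, fd2;

	fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
	if (!fh)
		return -1;
	fh->handle_bytes = MAX_HANDLE_SZ;

	/* An empty path with AT_EMPTY_PATH means "the file behind fd". */
	if (name_to_handle_at(fd, "", fh, &mnt_id, AT_EMPTY_PATH) < 0) {
		free(fh);
		return -1;
	}

	/* The handle is resolved on the filesystem that mount_fd is on. */
	fd2 = open_by_handle_at(mount_fd, fh, flags);
	free(fh);
	return fd2;
}
```

A caller would first use same_superblock() to pick a candidate mount and then verify that the returned fd2 really refers to the same inode as fd.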
Known problems with this approach:
- stat on btrfs gives wrong major/minor
- file handles do not work everywhere
- file handles can return fd2 on a deleted file or on another hardlink; this needs special handling.
Additionally (optional part): We can export the real major/minor in fdinfo (kernel). We can also think of a new kernel interface to get a mount's major/minor and root (shift from fsroot) for detached mounts; with it we would not need the file handle hack to find the file on another mount (see the fsinfo or getvalues kernel patches on LKML; can we add this info there?).
Details:
- Skill level: intermediate
- Language: C
- Mentor: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
- Suggested by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Suspended project ideas
Listed here are tasks that seem suitable for GSoC but currently have nobody to mentor them.
IOUring support
The io_uring Asynchronous I/O (AIO) framework is a new Linux I/O interface, first introduced in upstream Linux kernel version 5.1 (March 2019). It provides a low-latency and feature-rich interface for applications that require AIO functionality.
Links:
- https://blogs.oracle.com/linux/an-introduction-to-the-io_uring-asynchronous-io-framework
- https://github.com/axboe/liburing
Details:
- Skill level: expert (+linux kernel)
- Expected size: 350 hours
- Suggested by: Pavel Emelyanov <ovzxemul@gmail.com>
- Mentor: Pavel Emelyanov <ovzxemul@gmail.com>
Add support for SPFS
Summary: SPFS is a special filesystem that allows checkpoint and restore of such things as NFS and FUSE
NFS support is already implemented in Virtuozzo CRIU, but it would be very beneficial to port it to mainline CRIU. An important part of this is integrating the Stub-Proxy File System (SPFS) with LXC/yet_another_containers_environment.
Links:
- https://github.com/checkpoint-restore/criu/issues/60
- https://github.com/checkpoint-restore/criu/issues/53
- https://github.com/skinsbursky/spfs
- https://patchwork.criu.org/series/137/
Details:
- Skill level: expert
- Language: C
- Mentor: Alexander Mikhalitsyn <alexander@mihalicyn.com> / <alexander.mikhalitsyn@virtuozzo.com>
- Suggested by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Anonymise image files
Summary: Teach CRIT to remove sensitive information from images
When reporting a bug, it may not be acceptable for the reporter to send us raw images, as they may contain sensitive data. CRIT needs to be taught to "anonymise" images for publication.
List of data to shred:
- Memory contents. For the sake of investigation, all the memory contents can simply be removed. The sizes of the pages*.img files alone are enough.
- Paths to files. Here we should keep the relations of the paths to each other. The simplest way seems to be replacing file names with "random" (or sequential) strings, BUT (!) taking care to keep this mapping 1:1. Note that file paths may also sit in sk-unix.img.
- Registers.
- Process names. (But relations should be kept).
- Contents of streams, i.e. pipe/fifo data, sk-queue, tcp-stream, tty data.
- Ghost files.
- Tarballs with tmpfs-s.
- IP addresses in sk-inet-s, ip tool dumps and net*.img.
Links:
- Anonymize image files
- https://github.com/checkpoint-restore/criu/issues/360
- CRIT, Images
Details:
- Skill level: beginner
- Language: Python
- Mentor: Pavel Emelianov <xemul@virtuozzo.com>
- Suggested by: Pavel Emelianov <xemul@virtuozzo.com>