systemd System and Service Manager
CHANGES WITH 255 in spe:
Announcements of Future Feature Removals and Incompatible Changes:
* Support for split-usr (/usr/ mounted separately during late boot,
instead of being mounted by the initrd before switching to the rootfs)
and unmerged-usr (parallel directories /bin/ and /usr/bin/, /lib/ and
/usr/lib/, …) has been removed. For more details, see:
https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html
* We intend to remove cgroup v1 support from a systemd release after
the end of 2023. If you run services that make explicit use of
cgroup v1 features (i.e. the "legacy hierarchy" with separate
hierarchies for each controller), please implement compatibility with
cgroup v2 (i.e. the "unified hierarchy") sooner rather than later.
Most of Linux userspace has been ported over already.
* Support for System V service scripts is now deprecated and will be
removed in a future release. Please make sure to update your software
*now* to include a native systemd unit file instead of a legacy
System V script to retain compatibility with future systemd releases.
* Support for the SystemdOptions EFI variable is deprecated.
'bootctl systemd-efi-options' will emit a warning when used. It seems
that this feature is little-used and it is better to use alternative
approaches like credentials and confexts. The plan is to drop support
altogether at a later point, but this might be revisited based on
user feedback.
* systemd-run's switch --expand-environment= which currently is disabled
by default when combined with --scope, will be changed in a future
release to be enabled by default.
* "systemctl switch-root" is now restricted to initrd transitions only.
Transitions between real systems should be done with "systemctl soft-reboot"
instead.
* The ip=off and ip=none kernel command line options interpreted by
systemd-network-generator will now also disable IPv6RA and link-local
addressing. Previously only DHCP was turned off, while IPv6RA and IPv6
link-local addressing were left enabled.
* The NAMING_BRIDGE_MULTIFUNCTION_SLOT naming scheme has been deprecated
and is now disabled.
Service Manager:
* The way services are spawned has been overhauled. Previously, a
process was forked that shared all of the manager's memory (via
copy-on-write) while doing all the required setup (e.g. mount
namespaces, CGroup configuration, etc.) before exec'ing the target
executable. This was problematic for various reasons: several glibc
APIs were called that are not supposed to be used after a fork but
before an exec, copy-on-write meant that if either process (the
manager or the child) touched a memory page a copy was triggered, and
also the memory footprint of the child process was that of the
manager but with the memory limits of the service. From this version
onward, the new process is spawned using CLONE_VM and CLONE_VFORK
semantics via posix_spawn(), and it immediately execs a new internal
binary, systemd-executor, that receives the configuration to apply
via memfd, and sets up the process before exec'ing the target
executable.
* Most of the internal process tracking is being changed to use PIDFDs
instead of PIDs when the kernel supports it, to improve robustness
and reliability.
* A new option SurviveFinalKillSignal= is now supported to exempt a
unit's processes from the final SIGTERM/SIGKILL spree on shutdown. This
is part of the required configuration to let a unit's processes survive
a soft-reboot operation without being interrupted.
* System extension images (sysext) can now set
EXTENSION_RELOAD_MANAGER=1 in their extension-release files to
automatically reload the service manager (PID 1) when
merging/refreshing/unmerging on boot. Generally, while this can be
used to ship services in system extension images it's recommended to
do that via portable services instead.
* The ExtensionImages= and ExtensionDirectories= options now support
confexts images/directories.
* A new option NFTSet= provides a method for integrating dynamic cgroup IDs
into firewall rules with NFT sets. This makes it easy to use a control
group as a selector in firewall rules, which in turn allows more
fine-grained filtering. Without it, NFT rules for cgroup matching have to
use numeric cgroup IDs, which change every time a service is restarted and
are therefore hard to use in a systemd environment.
* A new option CoredumpReceive= can be set for service and scope units,
together with Delegate=yes, to make systemd-coredump on the host
forward core files from processes crashed inside the delegated CGroup
subtree to systemd-coredump running in the container. This new option
is by default used by systemd-nspawn containers that use the "--boot"
switch, i.e. are fully booted up.
* A new ConditionSecurity=measured-uki option is now available, to ensure
a unit can only run when the system has been booted from a measured UKI.
* MemoryAvailable= now considers physical memory if there are no CGroup
memory limits set anywhere in the tree.
* The $USER environment variable is now always set for services, while
previously it was only set if User= was specified. A new option
SetLoginEnvironment= is now supported to determine whether to also set
$HOME, $LOGNAME and $SHELL.
* Socket units now support a new pair of
PollLimitBurst=/PollLimitInterval= options to configure a limit on
how often polling events on the file descriptors backing this unit
will be considered within a time window.
* Scope units can now be created passing PIDFDs instead of PIDs to select
the processes they should include.
* Sending SIGRTMIN+18 with 0x500 as sigqueue() value will now cause the
manager to dump the list of currently pending jobs.
* If the kernel supports MOVE_MOUNT_BENEATH, the systemctl and machinectl
bind and mount-image verbs will now cause the new mount to replace
the old mount (if any), instead of overmounting it.
TPM2 Support + Disk Encryption & Authentication:
* systemd-cryptenroll now allows specifying a PCR bank and explicit hash
value in the --tpm2-pcrs= option.
* systemd-cryptenroll now allows specifying a TPM2 key handle to be used
instead of the default SRK via the new --tpm2-seal-key-handle= option.
* systemd-cryptsetup is now installed in /usr/bin/ and is no longer an
internal-only executable.
* The TPM2 Storage Root Key will now be set up, if not already present,
by a new systemd-tpm2-setup.service early boot service.
* The internal systemd-pcrphase executable has been renamed to
systemd-pcrextend.
* The systemd-pcrextend tool gained a new --pcr= switch to override
which PCR to measure into.
* systemd-pcrextend now exposes a Varlink interface at
io.systemd.PCRExtend that can be used to do measurements and event
logging on demand.
* TPM measurements are now also written to an event log at
/run/log/systemd/tpm2-measure.log, using a derivative of the TCG
Canonical Event Log format. Previously we'd only log them to the
journal, where they were, however, subject to rotation and similar.
* A new component "systemd-pcrlock" has been added that allows managing
local TPM2 PCR policies for PCRs 0-7 and similar, which are hard to
predict by the OS vendor because of the inherently local nature of
what measurements they contain, such as firmware versions of the
system and extension cards and suchlike. pcrlock can predict PCR
measurements ahead of time based on various inputs, such as the local
TPM2 event log, GPT partition tables, PE binaries, UKI kernels, and
various other things. It can then pre-calculate a TPM2 policy from
this, which it stores in a TPM2 NV index. TPM2 objects (such as disk
encryption keys) can be locked against this NV index, so that they
are locked against a specific combination of system firmware and
state. Alternatives for each component are supported to allowlist
multiple kernel versions or boot loader versions simultaneously
without losing access to the disk encryption keys. The tool can also
be used to analyze and validate the local TPM2 event
log. systemd-cryptsetup, systemd-cryptenroll, and systemd-repart have
all been updated to support such policies. There's currently no support
for locking the system's root disk against a pcrlock policy; this
will be added soon. Moreover, it is currently not possible to combine
a pcrlock policy with a signed PCR policy.
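To make the idea of predicting PCR values more concrete, the sketch below
replays a list of measurements using the generic TPM2 extend operation
(new PCR value = H(old PCR value || H(measured data))), assuming the
register starts out as all zeroes. This illustrates the general principle
only; it is not systemd-pcrlock's code, and predict_pcr() and its inputs
are made up for the example. Real event logs carry digests and event
types that are not modeled here.

```python
import hashlib


def predict_pcr(measurements, bank="sha256"):
    """Replay `measurements` (an iterable of bytes) into a zeroed PCR of
    the given hash bank and return the resulting PCR value."""
    # A freshly reset PCR is assumed to be all zero bytes.
    pcr = bytes(hashlib.new(bank).digest_size)
    for data in measurements:
        # Each measurement is hashed first, then folded into the PCR.
        digest = hashlib.new(bank, data).digest()
        pcr = hashlib.new(bank, pcr + digest).digest()
    return pcr
```

Because every extend depends on the previous value, the same components
measured in a different order yield a different PCR value, which is why a
policy has to enumerate the acceptable alternatives per component.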
systemd-boot, systemd-stub, ukify, bootctl, kernel-install:
* The 90-loaderentry kernel-install hook now supports installing device
trees.
* ukify is no longer considered experimental, and now ships in /usr/bin/.
* ukify gained a new verb, inspect, that describes the sections of a UKI
and prints the content of the well-known sections.
* bootctl will now show whether the system was booted from a UKI in its
status output.
* systemd-boot and systemd-stub now use different project keys in their
respective SBAT sections, so that they can be revoked individually if
needed.
* systemd-boot will no longer load unverified Devicetree blobs when UEFI
SecureBoot is enabled. For more details see:
https://github.com/systemd/systemd/security/advisories/GHSA-6m6p-rjcq-334c
* systemd-boot gained new hotkeys to reboot and power off the system
from the boot menu ("B" and "O"). If the "auto-poweroff" and
"auto-reboot" options in loader.conf are set these entries are also
shown as menu items (which is useful on devices lacking a regular
keyboard).
* systemd-boot gained a new configuration value "menu-disabled" for the
set-timeout option, to allow completely disabling the boot menu,
including the hotkey.
* systemd-boot will now measure the content of loader.conf in TPM2 PCR
5.
* systemd-stub will now concatenate the content of all kernel
command-line addons before measuring them in TPM2 PCR 12, in a single
measurement, instead of measuring them individually.
* systemd-stub will now measure and load Devicetree Blob addons, which
are searched and loaded following the same model as the existing
kernel command-line addons.
* systemd-stub will now ignore unauthenticated kernel command line options
passed from systemd-boot when running inside Confidential VMs with UEFI
SecureBoot enabled.
systemd-repart:
* A new option --copy-from= that synthesizes partition definitions from
the given image, which are then applied to the systemd-repart algorithm,
has been added.
* A new option --copy-source= has been added, which can be used to specify
a directory that CopyFiles= is considered relative to.
* New --make-ddi=confext, --make-ddi=sysext and --make-ddi=portable options
have been added to make it easier to generate these types of DDIs,
without having to provide repart.d definitions for them.
* The dm-verity salt and UUID will now be derived from the specified
seed value.
* New VerityDataBlockSizeBytes= and VerityHashBlockSizeBytes= settings can
now be configured in repart.d/ configuration files.
* A new Subvolumes= setting is now supported in repart.d/ configuration
files, to indicate which directories in the target partition should be
btrfs subvolumes.
Journal:
* The journalctl --lines= parameter now accepts +N to show the oldest N
entries instead of the newest.
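The difference between the two forms of --lines= can be pictured with a
small Python sketch. It only models the selection semantics on an
in-memory list ordered oldest first; select_entries() is a hypothetical
helper and says nothing about how journalctl is implemented.

```python
def select_entries(entries, lines_arg):
    """Return the entries a --lines= style argument would select.

    `entries` is ordered oldest first. "N" selects the newest N entries,
    "+N" selects the oldest N entries.
    """
    if lines_arg.startswith("+"):
        count = int(lines_arg[1:])
        return entries[:count]
    count = int(lines_arg)
    # entries[-0:] would return everything, so handle zero explicitly.
    return entries[-count:] if count > 0 else []
```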
Device Management:
* udev will now create symlinks to loopback block devices in the
/dev/disk/by-loop-ref/ directory that are based on the .lo_file_name
string field selected during allocation. The systemd-dissect tool and
the util-linux losetup command now support a complementary new
switch --loop-ref= for selecting the string. This means a loopback
block device may now be allocated under a caller-chosen reference and
can subsequently be referenced by that without first having to look
up the block device name the caller ended up with.
* udev also creates symlinks to loopback block devices in the
/dev/disk/by-loop-inode/ directory based on the .st_dev/st_ino fields
of the inode attached to the loopback block device. This means that
attaching a file to a loopback device will implicitly make a handle
available to be found via that file's inode information.
* udevadm info gained support for JSON output via a new --json= flag, and
for filtering output using the same mechanism that udevadm trigger
already implements.
* The predictable network interface naming logic is extended to include
the SR-IOV-R "representor" information in network interface names.
This feature was intended for v254, but even though the code was
merged, the part that actually enabled the feature was forgotten.
It is now enabled by default and is part of the new "v255" naming
scheme.
* A new hwdb/rules file has been added that sets the
ID_NET_AUTO_LINK_LOCAL_ONLY=1 udev property on all network interfaces
that should usually only be configured with link-local addressing
(IPv4LL + IPv6LL), i.e. for PC-to-PC cables ("laplink") or
Thunderbolt networking. systemd-networkd and NetworkManager (soon)
will make use of this information to apply an appropriate network
configuration by default.
* The ID_NET_DRIVER property on network interfaces is now set
relatively early in the udev rule set so that other rules may rely on
its use. This is implemented in a new "net-driver" udev built-in.
Network Management:
* The "duid-only" option for DHCPv4 client's ClientIdentifier= setting
is now dropped, as it never worked, hence it should not be used by
anyone.
* The 'prefixstable' IPv6 address generation mode now considers the
SSID when generating stable addresses, so that a different stable
address is used when roaming between wireless networks. If you
already use 'prefixstable' addresses with wireless networks, the
stable address chosen will be changed by the update.
* The DHCPv4 client gained a RapidCommit= option, default true, which
enables RFC4039 Rapid Commit behavior to obtain a lease in a
simplified 2-message exchange instead of the typical 4-message
exchange if also supported by the DHCP server.
* The DHCPv4 client gained new InitialCongestionWindow= and
InitialAdvertisedReceiveWindow= options for route configurations.
* The DHCPv4 client gained a new RequestAddress= option that allows
sending a preferred IP address in the initial DHCPDISCOVER message.
* The DHCPv4 server and client gained support for IPv6-only mode
(RFC8925).
* The SendHostname= and Hostname= options are now available for the
DHCPv6 client, independent of the DHCPv4 option, so that these
configuration values can be set independently for each client.
* The DHCPv4 and DHCPv6 client state can now be queried via D-Bus,
including lease information.
* The DHCPv6 client can now be configured to use a custom DUID type.
* .network files gained a new IPv4ReversePathFilter= setting in the
[Network] section, to control sysctl's rp_filter setting.
* .network files gained a new HopLimit= setting in the [Route] section,
to configure a per-route hop limit.
* .network files gained a new TCPRetransmissionTimeoutSec= setting in
the [Route] section, to configure a per-route TCP retransmission
timeout.
* A new directive NFTSet= provides a method for integrating network
configuration into firewall rules with NFT sets. The benefit of using
this setting is that static network configuration or dynamically
obtained network addresses can be used in firewall rules with the
indirection of NFT set types.
* The [IPv6AcceptRA] section supports the following new options:
UsePREF64=, UseHopLimit=, UseICMP6RateLimit= and NFTSet=.
* The [IPv6SendRA] section supports the following new options:
RetransmitSec=, HopLimit=, HomeAgent=, HomeAgentLifetimeSec= and
HomeAgentPreference=.
* A new [IPv6PREF64Prefix] set of options, containing Prefix= and
LifetimeSec=, has been introduced to append pref64 options in router
advertisements (RFC8781).
* The network generator now configures the interfaces with only
link-local addressing if ip=link-local is specified on the kernel
command line.
* The names of the configuration files generated by the network
generator from the kernel command line are now prefixed with '70-',
to give them higher precedence over the default configuration
files.
* Added a new -Ddefault-network=BOOL meson option, that causes more
.network files to be installed as enabled by default. These configuration
files match generic setups, e.g. 89-ethernet.network matches
all Ethernet interfaces and enables both DHCPv4 and DHCPv6 clients.
* If an ID_NET_MANAGED_BY= udev property is set on a network device and
it is any string other than "io.systemd.Network" then networkd will
not manage this device. This may be used to allow multiple network
management services to run in parallel and assign ownership of
specific devices explicitly. NetworkManager will soon implement a
similar logic.
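The ownership rule in the last item of this section can be expressed as a
small predicate. The sketch below represents a device's udev properties
as a plain dict; networkd_should_manage() is a hypothetical name, and the
code follows only the rule as worded above, not networkd's actual source.

```python
def networkd_should_manage(udev_properties):
    """Return True if networkd would consider the device its own,
    following the ID_NET_MANAGED_BY= rule as stated in the text."""
    managed_by = udev_properties.get("ID_NET_MANAGED_BY")
    if managed_by is None:
        # Property not set: no other service has claimed the device.
        return True
    # Property set: only networkd's own identifier keeps the device.
    return managed_by == "io.systemd.Network"
```

A udev rule that sets the property to another service's identifier is
then enough to hand a device over to that service.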
systemctl:
* systemctl is-failed now checks the system state if no unit is
specified.
* systemctl will now automatically soft-reboot if a new root file
system has been set up in /run/nextroot/ when a reboot operation
is invoked.
Login management:
* wall messages now work even when utmp support is disabled, using
systemd-logind to query the necessary information.
* systemd-logind now sends a new PrepareForShutdownWithMetadata D-Bus
signal before shutdown/reboot/soft-reboot, which carries additional
information compared to PrepareForShutdown. Currently
the additional information is the type of operation that is about to
be executed.
Hibernation & Suspend:
* The kernel and OS versions will no longer be checked on resume from
hibernation.
* Hibernation into swap files backed by btrfs is now
supported. (Previously this was supported only for other file
systems.)
Other:
* A new experimental systemd-vmspawn tool has been added, that aims to
provide for VMs the same interfaces and functionality that
systemd-nspawn provides for containers. For now it supports QEMU as a
backend, and exposes some of its options to the user.
* "systemd-analyze plot" has gained tooltips on each unit name with
related-unit information in its svg output, such as Before=,
Requires=, and similar properties.
* A new varlinkctl tool has been added to allow interfacing with
Varlink services, and introspection has been added to all such
services.
* systemd-sysext and systemd-confext now expose a Varlink service
at io.systemd.sysext.
* systemd-sysupdate now accepts directories in the MatchPattern= option.
* systemd-run will now output the invocation ID of the launched
transient unit.
* systemd-analyze, systemd-tmpfiles, systemd-sysusers, systemd-sysctl,
and systemd-binfmt gained a new --tldr option that can be used in
combination with --cat-config to suppress uninteresting configuration
lines, such as comments.
* resolvectl gained a new "show-server-state" command that shows
current statistics of the resolver. This is backed by a new
DumpStatistics() Varlink method provided by systemd-resolved.
* systemd-timesyncd will now emit a D-Bus signal when the LinkNTPServers
property changes.
* vconsole now supports KEYMAP=@kernel for preserving the kernel keymap
as-is.
* seccomp now supports the LoongArch64 architecture.
* systemd-id128 now supports a new -P option to show only values, and
combining --app with the show verb.
* A new pam_systemd_loadkey.so PAM module is now available, which
allows automatically fetching the passphrase used by cryptsetup to
unlock the root file system and setting it as the PAM authtok. This
enables, among other things, configuring auto-unlock of the GNOME
Keyring / KDE Wallet when autologin is configured.
* Many meson options now use the 'feature' type, which means they
take enabled/disabled/auto as values.
* A new meson option configfiledir can be used to change where
configuration files with default values are installed to.
* Options and verbs in man pages are now tagged with the version they
were first introduced in.
* A new component "systemd-storagetm" has been added, which exposes all
local block devices as NVMe-TCP devices, fully automatically. It's
hooked into a new target unit storage-target-mode.target that is
supposed to be booted into via
rd.systemd.unit=storage-target-mode.target on the kernel command
line. This is intended to be used for installers and debugging to
quickly get access to the local disk. It's inspired by macOS "target
disk mode".
* A new component "systemd-bsod" has been added, which can show logged
error messages full screen, if they have a log level of LOG_EMERG.
* The systemd-dissect tool's --with command will now set the
$SYSTEMD_DISSECT_DEVICE environment variable to the block device it
operates on for the invoked process.
* The systemd-mount tool gained a new --tmpfs switch for mounting a new
'tmpfs' instance. This is useful since it does so via .mount units
and thus can be executed remotely or in containers.
* The various tools in systemd that take "verbs" (such as systemctl,
loginctl, machinectl, …) now will suggest a close verb name in case
the user specified an unrecognized one.
* libsystemd now exports a new function sd_id128_get_app_specific()
that generates "app-specific" 128bit IDs from any ID. It's similar to
sd_id128_get_machine_app_specific() and
sd_id128_get_boot_app_specific() but takes the ID to base calculation
on as input. This new functionality is also exposed in the
"systemd-id128" tool where you can now combine --app= with `show`.
* All tools that parse timestamps now can also parse RFC3339 style
timestamps that include the "T" and "Z" characters.
* New documentation has been added:
https://systemd.io/FILE_DESCRIPTOR_STORE
https://systemd.io/TPM2_PCR_MEASUREMENTS
https://systemd.io/MOUNT_REQUIREMENTS.md
* The codebase now recognizes the suffixes .confext.raw and .sysext.raw
as alternatives to the .raw suffix generally accepted for DDIs. It is
recommended to name configuration extensions and system extensions
with such suffixes, to indicate their purpose in the name.
* The sd-device API gained a new function
sd_device_enumerator_add_match_property_required() which allows
configuring matches on properties that are strictly required. This is
different from the existing sd_device_enumerator_add_match_property()
matches, of which only one needs to apply.
* The MAC address assigned to the veth side of an nspawn container
may now be controlled via the $SYSTEMD_NSPAWN_NETWORK_MAC
environment variable.
* The libiptc dependency is now implemented via dlopen(), so that tools
such as networkd and nspawn no longer have a hard dependency on the
shared library when compiled with support for libiptc.
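The derivation behind sd_id128_get_app_specific() can be sketched in
Python, assuming it follows the scheme documented for
sd_id128_get_machine_app_specific(): an HMAC-SHA256 of the application
ID, keyed by the base ID, truncated to 128 bits and marked as a v4 UUID.
That the new function uses exactly this derivation is an assumption here,
and app_specific_id() is a hypothetical name, so the output should not be
expected to match the systemd-id128 tool.

```python
import hashlib
import hmac
import uuid


def app_specific_id(base_id, app_id):
    """Derive an app-specific 128-bit ID from `base_id` and `app_id`
    (both uuid.UUID) using a keyed hash, as assumed in the text above."""
    # Key the hash with the base ID so it cannot be recovered from the
    # result; the application ID is the message.
    digest = hmac.new(base_id.bytes, app_id.bytes, hashlib.sha256).digest()
    raw = bytearray(digest[:16])
    # Set the version (4) and variant bits so the result is a valid UUID.
    raw[6] = (raw[6] & 0x0F) | 0x40
    raw[8] = (raw[8] & 0x3F) | 0x80
    return uuid.UUID(bytes=bytes(raw))
```

The point of such a derivation is that an application gets a stable
identifier without ever seeing or exposing the ID it was derived from.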
Contributions from: 김인수, Abderrahim Kitouni, Adam Williamson,
Alexandre Peixoto Ferreira, Alex Hudspith, Alvin Alvarado,
André Paiusco, Antonio Alvarez Feijoo, Anton Lundin,
Arseny Maslennikov, Arthur Shau, Balázs Úr, beh_10257,
Benjamin Peterson, Bertrand Jacquin, Brian Norris, Chris Patterson,
Christian Hergert, Christian Hesse, Christian Kirbach, commondservice,
Curtis Klein, cvlc12, Daan De Meyer, Daniel P. Berrangé, Daniel Rusek,
Dan Streetman, David Rheinsberg, David Santamaría Rogado, David Tardon,
dependabot[bot], Dmitry V. Levin, Emanuele Giuseppe Esposito,
Emil Renner Berthing, Emil Velikov, Etienne Dechamps, Fabian Vogt,
felixdoerre, Franck Bui, Frantisek Sumsal, G2-Games, Gioele Barabucci,
Hugo Carvalho, huyubiao, IllusionMan1212, Jade Lovelace, janana,
Jan Janssen, Jan Kuparinen, Jan Macku, Jin Liu, Joerg Behrmann,
Johannes Segitz, Jordan Rome, Jordan Williams, Julien Malka,
Juno Computers, Khem Raj, khm, Kingbom Dou, Kiran Vemula,
Laszlo Gombos, Lennart Poettering, Luca Boccassi, Lucas Adriano Salles,
Lukas, Maanya Goenka, Maarten, Malte Poll, Marc Pervaz Boocha,
Martin Beneš, Martin Wilck, Mathieu Tortuyaux, Matthias Schiffer,
Maxim Mikityanskiy, Max Kellermann, Michael A Cassaniti, Michael Biebl,
Michael Kuhn, Michael Vasseur, Michal Koutný, Michal Sekletár,
Mike Yuan, Milton D. Miller II, mordner, msizanoen, NAHO,
Nandakumar Raghavan, Nick Rosbrook, NRK, Oğuz Ersen, Omojola Joshua,
pelaufer, Peter Hutterer, PhylLu, Pierre GRASSER, Piotr Drąg,
Priit Laes, Rahil Bhimjiani, Raito Bezarius, Raul Cheleguini,
Reto Schneider, Richard Maw, Robby Red, RoepLuke, Roland Hieber,
Ronan Pigott, Sam James, Sergey A, Susant Sahani, Sven Joachim,
Takashi Sakamoto, Thorsten Kukuk, Tj, Tomasz Świątek, Topi Miettinen,
Valentin David, Valentin Lefebvre, Victor Westerhuis, Vincent Haupert,
Vishal Chillara Srinivas, Warren, Xiaotian Wu, xinpeng wang,
Yu Watanabe, Zbigniew Jędrzejewski-Szmek, наб
CHANGES WITH 254:
Announcements of Future Feature Removals and Incompatible Changes:
* The next release (v255) will remove support for split-usr (/usr/
mounted separately during late boot, instead of being mounted by the
initrd before switching to the rootfs) and unmerged-usr (parallel
directories /bin/ and /usr/bin/, /lib/ and /usr/lib/, …). For more
details, see:
https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html
* We intend to remove cgroup v1 support from a systemd release after
the end of 2023. If you run services that make explicit use of
cgroup v1 features (i.e. the "legacy hierarchy" with separate
hierarchies for each controller), please implement compatibility with
cgroup v2 (i.e. the "unified hierarchy") sooner rather than later.
Most of Linux userspace has been ported over already.
* Support for System V service scripts is now deprecated and will be
removed in a future release. Please make sure to update your software
*now* to include a native systemd unit file instead of a legacy
System V script to retain compatibility with future systemd releases.
* Support for the SystemdOptions EFI variable is deprecated.
'bootctl systemd-efi-options' will emit a warning when used. It seems
that this feature is little-used and it is better to use alternative
approaches like credentials and confexts. The plan is to drop support
altogether at a later point, but this might be revisited based on
user feedback.
* EnvironmentFile= now treats the line following a comment line
that ends with an escape character (backslash) as a non-comment line.
For details, see:
https://github.com/systemd/systemd/issues/27975
* PrivateNetwork=yes and NetworkNamespacePath= now imply
PrivateMounts=yes unless PrivateMounts=no is explicitly specified.
* Behaviour of sandboxing options for the per-user service manager
units has changed. They now imply PrivateUsers=yes, which means user
namespaces will be implicitly enabled when a sandboxing option is
enabled in a user unit. Enabling user namespaces has the drawback
that system users will no longer be visible (and processes/files will
appear as owned by 'nobody') in the user unit.
By definition a sandboxed user unit should run with reduced
privileges, so impact should be small. This will remove a great
source of confusion that has been reported by users over the years,
due to how these options require an extra setting to be manually
enabled when used in the per-user service manager, which is not
needed in the system service manager. For more details, see:
https://lists.freedesktop.org/archives/systemd-devel/2022-December/048682.html
* systemd-run's switch --expand-environment= which currently is disabled
by default when combined with --scope, will be changed in a future
release to be enabled by default.
Security Relevant Changes:
* pam_systemd will now by default pass the CAP_WAKE_ALARM ambient
process capability to invoked session processes of regular users on
local seats (as well as to systemd --user), unless configured
otherwise via data from JSON user records, or via the PAM module's
parameter list. This is useful in order to allow desktop tools such as
GNOME's Alarm Clock application to set a timer for
CLOCK_REALTIME_ALARM that wakes up the system when it elapses. A
per-user service unit file may thus use AmbientCapabilities= to pass
the capability to invoked processes. Note that this capability is
relatively narrow in focus (in particular compared to other process
capabilities such as CAP_SYS_ADMIN) and we already — by default —
permit more impactful operations such as system suspend to local
users.
Service Manager:
* Memory limits that apply while the unit is activating are now
supported. Previously IO and CPU settings were already supported via
StartupCPUWeight= and similar. The same logic has been added for the
various manager and unit memory settings (DefaultStartupMemoryLow=,
StartupMemoryLow=, StartupMemoryHigh=, StartupMemoryMax=,
StartupMemorySwapMax=, StartupMemoryZSwapMax=).
* The service manager gained support for enqueuing POSIX signals to
services that carry an additional integer value, exposing the
sigqueue() system call. This is accessible via new D-Bus calls
org.freedesktop.systemd1.Manager.QueueSignalUnit() and
org.freedesktop.systemd1.Unit.QueueSignal(), as well as in systemctl
via the new --kill-value= option.
* systemctl gained a new "list-paths" verb, which shows all currently
active .path units, similarly to how "systemctl list-timers" shows
active timers, and "systemctl list-sockets" shows active sockets.
* systemctl gained a new --when= switch which is honoured by the various
forms of shutdown (i.e. reboot, kexec, poweroff, halt) and allows
scheduling these operations by time, similar in fashion to how this
has been supported by SysV shutdown.
* If MemoryDenyWriteExecute= is enabled for a service and the kernel
supports the new PR_SET_MDWE prctl() call, it is used instead of the
seccomp()-based system call filter to achieve the same effect.
* A new set of kernel command line options is now understood:
systemd.tty.term.<name>=, systemd.tty.rows.<name>=,
systemd.tty.columns.<name>= allow configuring the TTY type and
dimensions for the tty specified via <name>. When systemd invokes a
service on a tty (via TTYName=) it will look for these and configure
the TTY accordingly. This is particularly useful in VM environments
to propagate host terminal settings into the appropriate TTYs of the
guest.
* A new RootEphemeral= setting is now understood in service units. It
takes a boolean argument. If enabled for services that use RootImage=
or RootDirectory= an ephemeral copy of the disk image or directory
tree is made when the service is started. It is removed automatically
when the service is stopped. That ephemeral copy is made using
btrfs/xfs reflinks or btrfs snapshots, if available.
* The service activation logic gained new settings RestartSteps= and
RestartMaxDelaySec= which allow exponentially-growing restart
intervals for Restart=.
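As a rough illustration of how such growing intervals behave, here is
a minimal Python sketch. It assumes the delay grows geometrically from
RestartSec= to RestartMaxDelaySec= over RestartSteps= restarts and
then stays at the maximum; that interpolation is an assumption made
for this example and is not stated in the entry above. The function
name restart_delays() and the sample values are hypothetical.

```python
def restart_delays(restart_sec, max_delay_sec, steps, count):
    """Return the delay (in seconds) before each of the first `count` restarts.

    Assumption: the delay grows geometrically from restart_sec to
    max_delay_sec over `steps` restarts, then stays at max_delay_sec.
    """
    if restart_sec <= 0 or max_delay_sec < restart_sec or steps <= 0:
        raise ValueError("expected 0 < restart_sec <= max_delay_sec and steps > 0")

    ratio = max_delay_sec / restart_sec
    delays = []
    for i in range(count):
        if i >= steps:
            # Past the configured number of steps: cap at the maximum delay.
            delays.append(float(max_delay_sec))
        else:
            # Geometric interpolation between the initial and maximum delay.
            delays.append(restart_sec * ratio ** (i / steps))
    return delays


if __name__ == "__main__":
    # Hypothetical values: RestartSec=1, RestartMaxDelaySec=8, RestartSteps=3
    for attempt, delay in enumerate(restart_delays(1, 8, 3, 6)):
        print(f"restart {attempt}: wait {delay:.2f}s")
```

With these sample values the delay roughly doubles on each restart
until it reaches the configured maximum.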
* The service activation logic gained a new setting RestartMode= which
can be set to 'direct' to skip the inactive/failed states when
restarting, so that dependent units are not notified until the service
converges to a final (successful or failed) state. For example, this
means that OnSuccess=/OnFailure= units will not be triggered until the
service state has converged.
* PID 1 will now automatically load the virtio_console kernel module
during early initialization if running in a suitable VM. This is done
so that early-boot logging can be written to the console if available.
* Similarly, virtio-vsock support is loaded early in suitable VM
environments. PID 1 will send sd_notify() notifications via AF_VSOCK
to the VMM if configured, thus loading this early is beneficial.
* A new verb "fdstore" has been added to systemd-analyze to show the
current contents of the file descriptor store of a unit. This is
backed by a new D-Bus call DumpUnitFileDescriptorStore() provided by
the service manager.
* The service manager will now set a new $FDSTORE environment variable
when invoking processes for services that have the file descriptor
store enabled.
* A new service option FileDescriptorStorePreserve= has been added that
allows tuning the life-cycle of the per-service file descriptor
store. If set to "yes", the entries in the fd store are retained even
after the service has been fully stopped.
* The "systemctl clean" command may now be used to clear the fdstore of
a service.
* Unit *.preset files gained a new directive "ignore", in addition to
the existing "enable" and "disable". As the name suggests, matching
units are left unchanged, i.e. neither enabled nor disabled.
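The effect of the three directives can be sketched with a small,
simplified Python model. It assumes the first matching pattern wins
and that units matching no pattern are enabled; it ignores how preset
files are ordered across directories and any instance lists. The
function name preset_action() and the sample rules are hypothetical.

```python
from fnmatch import fnmatchcase

VALID_DIRECTIVES = ("enable", "disable", "ignore")


def preset_action(unit, rules):
    """Return "enable", "disable" or "ignore" for a unit name.

    `rules` is a list of (directive, pattern) pairs in the order they
    would be read. The first matching pattern wins. Units that match
    no pattern are enabled (assumed default of this simplified model).
    """
    for directive, pattern in rules:
        if directive not in VALID_DIRECTIVES:
            raise ValueError(f"unknown preset directive: {directive!r}")
        if fnmatchcase(unit, pattern):
            return directive
    return "enable"


if __name__ == "__main__":
    # Hypothetical preset rules, for illustration only.
    rules = [
        ("enable", "sshd.service"),
        ("ignore", "vendor-*.service"),
        ("disable", "*"),
    ]
    for name in ("sshd.service", "vendor-agent.service", "cups.service"):
        print(name, "->", preset_action(name, rules))
```

A unit for which the result is "ignore" keeps whatever enablement
state it currently has.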
* Service units gained a new setting DelegateSubgroup=. It takes the
name of a sub-cgroup to place any processes the service manager forks
off in. Previously, the service manager would place all service
processes directly in the top-level cgroup it created for the
service. This usually meant that the main process in a service with
delegation enabled would first have to create a subgroup and move
itself down into it, in order to not conflict with the "no processes
in inner cgroups" rule of cgroup v2. With this option, this step is
now handled by PID 1.
* The service manager will now look for .upholds/ directories,
similarly to the existing support for .wants/ and .requires/
directories. Symlinks in this directory result in Upholds=
dependencies.
The [Install] section of unit files gained support for a new
UpheldBy= directive to generate .upholds/ symlinks automatically when
a unit is enabled.
* The service manager now supports a new kernel command line option
systemd.default_device_timeout_sec=, which may be used to override
the default timeout for .device units.
* A new "soft-reboot" mechanism has been added to the service manager.
A "soft reboot" is similar to a regular reboot, except that it
affects userspace only: the service manager shuts down any running
services and other units, then optionally switches into a new root
file system (mounted to /run/nextroot/), and then passes control to a
systemd instance in the new file system which then starts the system
up again. The kernel is not rebooted and neither is the hardware,
firmware or boot loader. This provides a fast, lightweight mechanism
to quickly reset or update userspace, without the latency that a full
system reset involves. Moreover, open file descriptors may be passed
across the soft reboot into the new system where they will be passed
back to the originating services. This allows pinning resources
across the reboot, thus minimizing grey-out time further. This new
reboot mechanism is accessible via the new "systemctl soft-reboot"
command.
* Services using RootDirectory= or RootImage= will now have read-only
access to a copy of the host's os-release file under
/run/host/os-release, which will be kept up-to-date on 'soft-reboot'.
This was already the case for Portable Services, and the feature has
now been extended to all services that do not run off the host's
root filesystem.
* A new service setting MemoryKSM= has been added to enable kernel
same-page merging individually for services.
* A new service setting ImportCredentials= has been added that augments
LoadCredential= and LoadCredentialEncrypted= and searches for
credentials to import from the system, and supports globbing.
* A new job mode "restart-dependencies" has been added to the service
manager (exposed via systemctl --job-mode=). It is only valid when
used with "start" jobs, and has the effect that the "start" job will
be propagated as "restart" jobs to currently running units that have
a BindsTo= or Requires= dependency on the started unit.
* A new verb "whoami" has been added to "systemctl" which determines as
part of which unit the command is being invoked. It writes the unit
name to standard output. If one or more PIDs are specified, it reports
the names of the units that the processes referenced by the PIDs
belong to.
* The system and service credential logic has been improved: there's
now a clearly defined place where system provisioning tools running
in the initrd can place credentials that will be imported into the
system's set of credentials during the initrd → host transition: the
/run/credentials/@initrd/ directory. Once the credentials placed
there are imported into the system credential set they are deleted
from this directory, and the directory itself is deleted afterwards
too.
* A new kernel command line option systemd.set_credential_binary= has
been added, that is similar to the pre-existing
systemd.set_credential= but accepts arbitrary binary credential data,
encoded in Base64. Note that the kernel command line is not a
recommended way to transfer credentials into a system, since it is
world-readable from userspace.
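A minimal Python sketch of preparing such a parameter follows. It
assumes the option takes the form NAME:BASE64, mirroring the NAME:VALUE
form of systemd.set_credential=, and uses the standard Base64 alphabet.
The function name credential_cmdline() and the credential name
"demo.blob" are hypothetical.

```python
import base64


def credential_cmdline(name, data):
    """Build a kernel command line fragment carrying a binary credential.

    Assumption: the option has the form
    systemd.set_credential_binary=NAME:BASE64
    """
    if ":" in name or any(ch.isspace() for ch in name):
        raise ValueError("credential name must not contain ':' or whitespace")
    encoded = base64.b64encode(data).decode("ascii")
    return f"systemd.set_credential_binary={name}:{encoded}"


if __name__ == "__main__":
    # Hypothetical credential name and payload, for illustration only.
    print(credential_cmdline("demo.blob", b"\x00\x01"))
```

As noted above, anything passed this way is readable by all of
userspace, so the sketch is only suitable for non-secret data.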
* The default machine ID to use may now be configured via the
system.machine_id system credential. It will only be used if no
machine ID was set yet on the host.
* On Linux kernel 6.4 and newer, system and service credentials will now
be placed in a tmpfs instance that has the "noswap" mount option
set. Previously, a "ramfs" instance was used. By switching to tmpfs
ACL support and overall size limits can now be enforced, without
compromising on security, as the memory is never paged out either
way.
* The service manager now can detect when it is running in a
'Confidential Virtual Machine', and a corresponding 'cvm' value is now
accepted by ConditionSecurity= for units that want to conditionalize
themselves on this. systemd-detect-virt gained new '--cvm' and
'--list-cvm' switches to respectively perform the detection or list
all known flavours of confidential VM, depending on the vendor. The
manager will publish a 'ConfidentialVirtualization' D-Bus property,
and will also set a SYSTEMD_CONFIDENTIAL_VIRTUALIZATION= environment
variable for unit generators. Finally, udev rules can match on a new
'cvm' key that will be set when in a confidential VM.
Additionally, when running in a 'Confidential Virtual Machine', SMBIOS
strings and QEMU's fw_cfg protocol will not be used to import
credentials and kernel command line parameters by the system manager,
systemd-boot and systemd-stub, because the hypervisor is considered
untrusted in this particular setting.
Journal:
* The sd-journal API gained a new call sd_journal_get_seqnum() to
retrieve the current log record's sequence number and sequence number
ID, which allows applications to order records the same way as
journal does internally. The sequence number is now also exported in
the JSON and "export" output of the journal.
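The following Python sketch shows how a consumer of the JSON output
might use this to order records. It assumes the sequence number and
sequence number ID appear as the string-valued fields __SEQNUM and
__SEQNUM_ID in each JSON object, and that sequence numbers are only
comparable between records sharing the same ID. The function name
order_records() and the sample records are hypothetical.

```python
import json


def order_records(json_lines):
    """Parse one JSON journal record per line and sort by sequence number.

    Assumptions: each record carries "__SEQNUM_ID" and "__SEQNUM" as
    strings. Records are grouped by ID first, since sequence numbers
    from different IDs are not comparable with each other.
    """
    records = [json.loads(line) for line in json_lines if line.strip()]
    # Convert to int so that "10" sorts after "9".
    return sorted(records, key=lambda r: (r["__SEQNUM_ID"], int(r["__SEQNUM"])))


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    sample = [
        '{"__SEQNUM_ID": "aaaa", "__SEQNUM": "10", "MESSAGE": "second"}',
        '{"__SEQNUM_ID": "aaaa", "__SEQNUM": "9", "MESSAGE": "first"}',
    ]
    for record in order_records(sample):
        print(record["__SEQNUM"], record["MESSAGE"])
```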
* journalctl gained a new switch --truncate-newline. If specified,
multi-line log records will be truncated at the first newline,
i.e. only the first line of each log message will be shown.
* systemd-journal-upload gained support for --namespace=, similar to
the switch of the same name of journalctl.
systemd-repart:
* systemd-repart's drop-in files gained a new ExcludeFiles= option which
may be used to exclude certain files from the effect of CopyFiles=.
* systemd-repart's Verity support now implements the Minimize= setting
to minimize the size of the resulting partition.
* systemd-repart gained a new --offline= switch, which may be used to
control whether images shall be built "online" or "offline",
i.e. whether to make use of kernel facilities such as loopback block
devices and device mapper or not.
* If systemd-repart is told to populate a newly created ESP or XBOOTLDR
partition with some files, it will now default to VFAT rather than
ext4.
* systemd-repart gained a new --architecture= switch. If specified, the
per-architecture GPT partition types (i.e. the root and /usr/
partitions) configured in the partition drop-in files are
automatically adjusted to match the specified CPU architecture, in
order to simplify cross-architecture DDI building.
* systemd-repart will now default to a minimum size of 300MB for XFS
filesystems if no size parameter is specified. This matches what the
XFS tools (xfsprogs) can support.
systemd-boot, systemd-stub, ukify, bootctl, kernel-install:
* gnu-efi is no longer required to build systemd-boot and systemd-stub.
Instead, pyelftools is now needed, and it will be used to perform the
ELF -> PE relocations at build time.
* bootctl gained a new switch --print-root-device/-R that prints the
block device the root file system is backed by. If specified twice,
it returns the whole disk block device (as opposed to partition block
device) the root file system is on. It's useful for invocations such
as "cfdisk $(bootctl -RR)" to quickly show the partition table of the
running OS.
* systemd-stub will now look for the SMBIOS Type 11 field
"io.systemd.stub.kernel-cmdline-extra" and append its value to the
kernel command line it invokes. This is useful for VMMs such as qemu
to pass additional kernel command lines into the system even when
booting via full UEFI. The contents of the field are measured into
TPM PCR 12.
* The KERNEL_INSTALL_LAYOUT= setting for kernel-install gained a new
value "auto". With this value, a kernel will be automatically
analyzed, and if it qualifies as UKI, it will be installed as if the
setting was set to "uki", otherwise as "bls".
* systemd-stub can now optionally load UEFI PE "add-on" images that may
contain additional kernel command line information. These "add-ons"
superficially look like a regular UEFI executable, and are expected
to be signed via SecureBoot/shim. However, they do not actually
contain code, but instead a subset of the PE sections that UKIs
support. They are supposed to provide a way to extend UKIs with
additional resources in a secure and authenticated way. Currently,
only the .cmdline PE section may be used in add-ons, in which case
any specified string is appended to the command line embedded into
the UKI itself. A new 'addon<EFI-ARCH>.efi.stub' is now provided that
can be used to trivially create addons, via 'ukify' or 'objcopy'. In
the future we expect other sections to be made extensible like this as
well.
* ukify has been updated to allow building these UEFI PE "add-on"
images, using the new 'addon<EFI-ARCH>.efi.stub'.
* ukify gained a new "genkey" verb for generating a set of key pairs
to sign UKIs and their PCR data with.
* ukify now accepts SBAT information to place in the .sbat PE section
of UKIs and addons. If a UKI is built, the SBAT information from the
inner kernel is merged with any SBAT information associated with
systemd-stub and the SBAT data specified on the ukify command line.
* The kernel-install script has been rewritten in C, and reuses much of
the infrastructure of existing tools such as bootctl. It also gained
--esp-path= and --boot-path= options to override the path to the ESP,
and the $BOOT partition. Options --make-entry-directory= and
--entry-token= have been added as well, similar to bootctl's options
of the same name.
* A new kernel-install plugin 60-ukify has been added which will
combine kernel/initrd locally into a UKI and optionally sign it
with a local key. This may be used to switch to UKI mode even on
systems where a local kernel or initrd is used. (Typically UKIs are
built and signed by the vendor.)
* The ukify tool now supports "pesign" in addition to the pre-existing
"sbsign" for signing UKIs.
* systemd-measure and systemd-stub now look for the .uname PE section
that should contain the kernel's "uname -r" string.
* systemd-measure and ukify now calculate expected PCR hashes for a UKI
"offline", i.e. without access to a TPM (physical or
software-emulated).
Memory Pressure & Control:
* The sd-event API gained new calls sd_event_add_memory_pressure(),
sd_event_source_set_memory_pressure_type(),
sd_event_source_set_memory_pressure_period() to create and configure
an event source that is called whenever the OS signals memory
pressure. Another call sd_event_trim_memory() is provided that
compacts the process' memory use by releasing allocated but unused
malloc() memory back to the kernel. Services can also provide their
own custom callback to do memory trimming. This should improve system
behaviour under memory pressure, as Linux traditionally provided
no mechanism to return process memory back to the kernel if the
kernel was under memory pressure. This makes use of the kernel's PSI
interface. Most long-running services in systemd have been hooked up
with this, and in particular systems with low memory should benefit
from this.
* Service units gained new settings MemoryPressureWatch= and
MemoryPressureThresholdSec= to configure the PSI memory pressure
logic individually. If these options are used, the
$MEMORY_PRESSURE_WATCH and $MEMORY_PRESSURE_WRITE environment
variables will be set for the invoked processes to inform them about
the requested memory pressure behaviour. (This is used by the
aforementioned sd-event API additions, if set.)
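A service that does not use sd-event could interpret these variables
itself. The Python sketch below makes two assumptions that are not
stated in this entry: $MEMORY_PRESSURE_WATCH names the file to watch,
with /dev/null meaning that monitoring is disabled, and
$MEMORY_PRESSURE_WRITE holds a Base64-encoded string to write to that
file before waiting for events. The function name
memory_pressure_config() and the sample values are hypothetical.

```python
import base64


def memory_pressure_config(environ):
    """Return (path, data_to_write), or None if monitoring is not requested.

    Assumptions: MEMORY_PRESSURE_WATCH is the file to watch, with
    /dev/null meaning "disabled"; MEMORY_PRESSURE_WRITE is Base64-encoded
    data to write to that file before polling it.
    """
    path = environ.get("MEMORY_PRESSURE_WATCH")
    if not path or path == "/dev/null":
        return None
    encoded = environ.get("MEMORY_PRESSURE_WRITE")
    data = base64.b64decode(encoded) if encoded else b""
    return path, data


if __name__ == "__main__":
    # Hypothetical environment, for illustration only.
    env = {
        "MEMORY_PRESSURE_WATCH": "/sys/fs/cgroup/demo.service/memory.pressure",
        "MEMORY_PRESSURE_WRITE": base64.b64encode(b"some 200000 2000000").decode(),
    }
    print(memory_pressure_config(env))
```

In a real service the function would be called with os.environ, and
the returned path would then be opened, written to and polled.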
* systemd-analyze gained a new "malloc" verb that shows the output
generated by glibc's malloc_info() on services that support it. Right
now, only the service manager has been updated accordingly. This
call requires privileges.
User & Session Management:
* The sd-login API gained a new call sd_session_get_username() to
return the user name of the owner of a login session. It also gained
a new call sd_session_get_start_time() to retrieve the time the login
session started. A new call sd_session_get_leader() has been added to
return the PID of the "leader" process of a session. A new call
sd_uid_get_login_time() returns the time since the specified user has
most recently been continuously logged in with at least one session.
* JSON user records gained a new set of fields capabilityAmbientSet and
capabilityBoundingSet which contain a list of POSIX capabilities to
set for the logged in users in the ambient and bounding sets,
respectively. homectl gained the ability to configure these two sets
for users via --capability-bounding-set=/--capability-ambient-set=.
* pam_systemd learnt two new module options
default-capability-bounding-set= and default-capability-ambient-set=,
which configure the default bounding and ambient sets for users as they are
logging in, if the JSON user record doesn't specify this explicitly
(see above). The built-in default for the ambient set now contains
the CAP_WAKE_ALARM capability, thus allowing regular users who may log in
locally to resume from a system suspend via a timer.
* The Session D-Bus objects of systemd-logind gained a new SetTTY() method
call to update the TTY of a session after it has been allocated. This
is useful for SSH sessions which are typically allocated first, and
for which a TTY is added later.
* The sd-daemon API gained a new call sd_pid_notifyf_with_fds() which
combines the various other sd_pid_notify() flavours into one: it takes a
format string, an overriding PID, and a set of file descriptors to
send. It also gained a new call sd_pid_notify_barrier() which is
equivalent to sd_notify_barrier() but allows the originating PID to
be specified.
* "loginctl list-users" and "loginctl list-sessions" will now show the
state of each logged in user/session in their tabular output. It will
also show the current idle state of sessions.
DDIs: