fix(NFSSR): ensure we can attach SR during attach_from_config call #44

Open
wants to merge 20 commits into
base: 2.30.8-8.2

Wescoeur
Member

We can get a trace like this if the SR is not attached:

Oct 10 16:02:59 xcp4 SM: [2564] ***** NFSFileVDI.attach_from_config: EXCEPTION <type 'exceptions.AttributeError'>, 'NoneType' object has no attribute 'xenapi'
Oct 10 16:02:59 xcp4 SM: [2564]   File "/opt/xensource/sm/NFSSR", line 296, in attach_from_config
Oct 10 16:02:59 xcp4 SM: [2564]     self.sr.attach(sr_uuid)
Oct 10 16:02:59 xcp4 SM: [2564]   File "/opt/xensource/sm/NFSSR", line 148, in attach
Oct 10 16:02:59 xcp4 SM: [2564]     self._check_hardlinks()
Oct 10 16:02:59 xcp4 SM: [2564]   File "/opt/xensource/sm/FileSR.py", line 1122, in _check_hardlinks
Oct 10 16:02:59 xcp4 SM: [2564]     self.session.xenapi.SR.remove_from_sm_config(
Oct 10 16:02:59 xcp4 SM: [2564]

This happens because the session is not set during this call.
So instead of using the XenAPI to store hardlink support, use a file on the storage itself.
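
To make the idea concrete, here is a minimal sketch of recording hardlink support in a file on the storage rather than in the SR's sm-config. It is not the code from this PR: the marker file name and both function names are hypothetical, and the sketch assumes the SR is already mounted at `sr_path`.

```python
import os
import tempfile

# Hypothetical marker name; the PR text does not say what the file is called.
NO_HARDLINK_MARKER = "sm-no-hardlinks"


def check_hardlinks(sr_path):
    """Probe hardlink support in sr_path and persist the result on the SR.

    Returns True when hardlinks work. No XAPI session is needed.
    """
    marker = os.path.join(sr_path, NO_HARDLINK_MARKER)
    fd, src = tempfile.mkstemp(prefix="hardlink-probe-", dir=sr_path)
    os.close(fd)
    dst = src + ".link"
    try:
        os.link(src, dst)
        supported = True
    except OSError:
        supported = False
    finally:
        # Always clean up the probe files.
        for path in (dst, src):
            if os.path.exists(path):
                os.unlink(path)

    if supported:
        # Drop a stale marker left by a previous probe.
        if os.path.exists(marker):
            os.unlink(marker)
    else:
        # An empty file is enough: its presence is the flag.
        open(marker, "w").close()
    return supported


def hardlinks_supported(sr_path):
    """Read the persisted result without probing again."""
    return not os.path.exists(os.path.join(sr_path, NO_HARDLINK_MARKER))
```

Because the flag lives next to the VHDs, any code path that can reach the mount point can read it, including calls such as `attach_from_config` where no session is available.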

@Wescoeur Wescoeur requested review from stormi and benjamreis October 12, 2023 09:09
@Wescoeur Wescoeur force-pushed the 2.30.8-8.2-filesr-hardlink-check-without-xapi branch 2 times, most recently from d3bb3ca to 2c33ec0 Compare October 12, 2023 10:03
MarkSymsCtx and others added 19 commits October 13, 2023 14:43
…probe calls

Signed-off-by: Mark Syms <[email protected]>
Signed-off-by: Ronan Abhamon <[email protected]>
This was a patch added to the sm RPM git repo before we had this
forked git repo for sm in the xcp-ng github organisation.
This was a patch added to the sm RPM git repo before we had this
forked git repo for sm in the xcp-ng github organisation.
The driver is needed to transition to the ext driver.
Users who upgrade from XCP-ng <= 8.0 need a working driver so that they
can move the VMs out of the ext4 SR and delete the SR.

Not keeping that driver would force such users to upgrade to 8.1 first,
convert their SR, then upgrade to a higher version.

However, like in XCP-ng 8.1, the driver will refuse any new ext4 SR
creation.
Some important points:

- linstor.KV must use an identifier name that starts with a letter (so it uses a "sr-" prefix).

- Encrypted VDIs are supported with the key_hash attribute (not tested, experimental).

- When a new LINSTOR volume is created on a host (via snapshot or create), the remaining diskless
devices are not necessarily created on other hosts. So if a resource definition exists without a
local device path, we request it from LINSTOR. Wait 5s for symlink creation when a new volume
is created => 5s is purely arbitrary, but this guarantees that we do not try to access the
volume if the symlink has not yet been created by the udev rule.

- The provisioning mode can be changed using the device config 'provisioning' param.

- We can only increase volume size (See: LINBIT/linstor-server#66).
It would be great if we could shrink volumes to limit the space used by the snapshots.

- Inflate/Deflate can only be executed on the master host, a linstor-manager plugin is present
to do this from slaves. The same plugin is used to open LINSTOR ports + start controller.

- Use a `total_allocated_volume_size` method to have a good idea of the reserved memory.
Why? Because `physical_free_size` is computed using the LVM used size. In the case of thick provisioning it's ok,
but when thin provisioning is chosen LVM returns only the allocated size using the used block count. So this method
solves this problem: it takes the fixed virtual volume size of each node to compute the required size to store the
volume data.

- Call vhd-util on remote hosts using the linstor-manager when necessary: when vhd-util is called to get VHD info,
the DRBD device can be in use (and unusable by external processes), so we must use the local LVM device that
contains the DRBD data or a remote disk if the DRBD device is diskless.

- If a DRBD device is in use when vhdutil.getVHDInfo is called, we must have no
errors. So a LinstorVhdUtil wrapper is now used to bypass DRBD layer when
VDIs are loaded.

- Refresh PhyLink when unpause is called on DRBD devices:
We must always recreate the symlink to ensure we have
the right info. Why? Because if the volume UUID is changed in
LINSTOR the symlink is not directly updated. When live leaf
coalesce is executed we have these steps:
"A" -> "OLD_A"
"B" -> "A"
Without symlink update the previous "A" path is reused instead of
"B" path. Note: "A", "B" and "OLD_A" are UUIDs.

- Since the linstor python modules are not present on every XCP-ng host,
module imports are protected by try... except... blocks.

- Provide a linstor-monitor daemon to check master changes
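
Two of the points above, the guarded imports and the wait for the udev-created symlink, can be sketched as follows. This is only an illustration under assumptions: `LINSTOR_AVAILABLE` and `wait_for_path` are hypothetical names, and the sketch polls up to a timeout, whereas the commit message only says that it waits 5s.

```python
import os
import time

# Guarded import: hosts without the LINSTOR python modules can still load this file.
try:
    import linstor  # only present on hosts where LINSTOR is installed
    LINSTOR_AVAILABLE = True
except ImportError:
    linstor = None
    LINSTOR_AVAILABLE = False


def wait_for_path(path, timeout=5.0, interval=0.1):
    """Poll until path exists or timeout seconds elapse.

    Returns True if the path appeared in time, False otherwise.
    os.path.exists follows symlinks, so a dangling link does not count.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Callers would check `LINSTOR_AVAILABLE` before touching the `linstor` module, and treat a `False` return from `wait_for_path` as "the device is not ready" rather than trying to open it.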
- Check if "create" doesn't succeed without zfs packages
- Check if "scan" fails if the path is not mounted (not a ZFS mountpoint)
Some QNAP devices do not provide ACL when fetching NFS mounts.
In this case the assumed ACL should be: "*".

This commit fixes the crash when attempting to access the non existing ACL.
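
As an illustration of the fallback described above, here is a minimal sketch that parses one line of an export listing and assumes an ACL of "*" when the server does not report one. The function name and the whitespace-separated "path ACL" line format are assumptions made for the example; this is not the parser used in sm.

```python
def parse_export_line(line):
    """Split an export listing line into (path, acl).

    Returns None for blank lines. When the ACL column is missing,
    as reported for some QNAP devices, "*" is assumed.
    """
    parts = line.split()
    if not parts:
        return None
    path = parts[0]
    # Assumed default: no ACL column means "everyone".
    acl = parts[1] if len(parts) > 1 else "*"
    return path, acl
```

Unpacking the result of `line.split()` into exactly two names would raise a `ValueError` on a one-column line, which is the kind of crash a default value avoids.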
Relevant issues:
- xapi-project#511
- xcp-ng/xcp#113
Co-authored-by: Piotr Robert Konopelko <[email protected]>
Signed-off-by: Aleksander Wieliczko <[email protected]>
Signed-off-by: Ronan Abhamon <[email protected]>
`umount` should not be called when `legacy_mode` is enabled, otherwise a mounted dir
used during SR creation is unmounted at the end of the `create` call (and also
when a PBD is unplugged) in `detach` block.

Signed-off-by: Ronan Abhamon <[email protected]>
A sm-config boolean param `subdir` is available to configure where to store the VHDs:
- In a subdir with the SR UUID, the new behavior
- In the root directory of the MooseFS SR

By default, new SRs are created with `subdir` = True.
Existing SRs are not modified and continue to use the folder that was given at
SR creation, directly, without looking for a subdirectory.
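
The path selection this implies can be sketched as below. The function name is hypothetical, and the sketch assumes that an SR without a `subdir` key in its sm-config is an existing SR that stores its VHDs in the root directory. The values are compared as strings because sm-config is a string-to-string map.

```python
import os


def vhd_directory(mount_root, sr_uuid, sm_config):
    """Return the directory that holds the VHDs of a MooseFS SR.

    Assumption: a missing 'subdir' key means a pre-existing SR,
    which keeps using the root directory.
    """
    use_subdir = str(sm_config.get("subdir", "false")).lower() == "true"
    if use_subdir:
        return os.path.join(mount_root, sr_uuid)
    return mount_root
```

For example, `vhd_directory("/mnt/moosefs", sr_uuid, {"subdir": "true"})` points at the per-SR subdirectory, while an empty sm-config keeps the root of the mount.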

Signed-off-by: Ronan Abhamon <[email protected]>
Ensure all shared drivers are imported in `_is_open` definition to register
them in the driver list. Otherwise this function always fails with a SRUnknownType exception.

Also, we must add two fake mandatory parameters to make MooseFS happy: `masterhost` and `rootpath`.
Same for CephFS with: `serverpath`. (NFS driver is directly patched to ensure there is no usage of
the `serverpath` param because its value is equal to None.)

`location` param is required to use ZFS, to be more precise, in the parent class: `FileSR`.

Signed-off-by: Ronan Abhamon <[email protected]>
SR_CACHING offers the capacity to use IntelliCache, but this
feature is only available using NFS SR.

For more details, the implementation of `_setup_cache` in blktap2.py
uses only an instance of NFSFileVDI for the shared target.

Signed-off-by: Ronan Abhamon <[email protected]>
The probe method is not implemented so we
shouldn't advertise it.

Signed-off-by: BenjiReis <[email protected]>
When static VDIs are used there are no snapshots and we don't want to
call methods from XAPI.

Signed-off-by: Guillaume <[email protected]>
This file is meant to remain unchanged and regularly updated along with
the SM component. Users can create a custom configuration file in
/etc/multipath/conf.d/ instead.

Signed-off-by: Samuel Verschelde <[email protected]>
(cherry picked from commit b44d3f5)
Meant to be installed as /etc/multipath/conf.d/custom.conf for users
to have an easy entry point for editing, as well as information on what
will happen to this file through future system updates and upgrades.

Signed-off-by: Samuel Verschelde <[email protected]>
(cherry picked from commit 18b79a5)
Update Makefile so that the file is installed along with sm.

Signed-off-by: Samuel Verschelde <[email protected]>
@stormi
Member

stormi commented Oct 16, 2023

The PR needs to be rebased, since the 2.30.8-8.2 branch's history changed.

@stormi
Member

stormi commented Oct 16, 2023

Is the change safe for 8.2 LTS?

We can get a trace like this if the SR is not attached:
```
Oct 10 16:02:59 xcp4 SM: [2564] ***** NFSFileVDI.attach_from_config: EXCEPTION <type 'exceptions.AttributeError'>, 'NoneType' object has no attribute 'xenapi'
Oct 10 16:02:59 xcp4 SM: [2564]   File "/opt/xensource/sm/NFSSR", line 296, in attach_from_config
Oct 10 16:02:59 xcp4 SM: [2564]     self.sr.attach(sr_uuid)
Oct 10 16:02:59 xcp4 SM: [2564]   File "/opt/xensource/sm/NFSSR", line 148, in attach
Oct 10 16:02:59 xcp4 SM: [2564]     self._check_hardlinks()
Oct 10 16:02:59 xcp4 SM: [2564]   File "/opt/xensource/sm/FileSR.py", line 1122, in _check_hardlinks
Oct 10 16:02:59 xcp4 SM: [2564]     self.session.xenapi.SR.remove_from_sm_config(
Oct 10 16:02:59 xcp4 SM: [2564]
```

This happens because the session is not set during this call.
So instead of using the XenAPI to store hardlink support, use a file on the storage itself.

Signed-off-by: Ronan Abhamon <[email protected]>
@Wescoeur Wescoeur force-pushed the 2.30.8-8.2-filesr-hardlink-check-without-xapi branch from 2c33ec0 to b04358e Compare October 16, 2023 12:46
@Wescoeur
Member Author

From my POV yes, and it fixes a bad attach required by xha using an NFS SR when a host is rebooted.

@stormi
Member

stormi commented Oct 16, 2023

We need a card for this in the board.

6 participants