-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update smartctl.py #46
base: master
Are you sure you want to change the base?
Conversation
Try to add hard raid support for smartctl.py Signed-off-by: 1234Erwan <[email protected]>
Signed-off-by: 1234Erwan <[email protected]>
Hey thanks a lot for your contribution! We'll review your PR ASAP 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx a lot for your PR. I added some comments.
@@ -14,32 +14,47 @@ def _list_disks(): | |||
disks = [] | |||
result = run_command(['smartctl', '--scan']) | |||
for line in result['stdout'].splitlines(): | |||
if line.startswith('/dev/') and not line.startswith('/dev/bus/'): | |||
disks.append(line.split()[0]) | |||
disks.append(line.split()[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that you can get the type of the disk here. No need to an extra function _list_raids.
Maybe return a tuple here with line.split()[0] and line.split()[2]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'v tryed te solve that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the most readable way to do that is to use a dict. For example you can do:
devices.append({'name': line.split()[0], 'type': line.split()[2]})
By doing this way we know what is expecting and when calling the smartctl
later in the code you won't need the index you will be able to use the explicit field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I used devices
here because it looks like it is closer from the man page of the smartctl and IMHO we are getting the device name and the device type so it looks better :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also renamed _list_disks
into _list_devices
if you decide to rename it.
return disks | ||
|
||
@error_wrapped | ||
def _list_raids(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like you are getting the type. It is not specifically for raids. And I think that this function is not needed because we are already doing the scan in _list_disks(). See my previous comment
@error_wrapped | ||
def get_information(session, args): | ||
results = {} | ||
raids = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed if disks holds a tuple with device and type
with OperationLocker(): | ||
disks = _list_disks() | ||
raids = _list_raids() | ||
for disk in disks: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here you can get disk, disk_type from disks if it holds the tuple
Removed unnecessary lines from smartctl.py Signed-off-by: 1234Erwan <[email protected]>
I thought about a solution like this instead of managing an index.
|
How do we move on on this PR? |
@
I think that managing index is error prone and IMHO using the solution I posted looks better but I don't have a strong opinion so if it is ok for you we can merge it and fix it later. @1234Erwan what do you think? |
Also I just tested the patch on a physical machine that has a megaraid and it fails:
In fact the issue is because the command
But the return code is not 0... it is 4. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
health commands failed on my host, pytest fails and pycodecheck also fails.
No worries we will have a look.
In fact the exit code status means : |
my python skills are quite limited, what is sure is that even if my script is not clean at all it works on my hard raid |
When you say it works you mean also the |
Add usage of tuples and disk renamed by device. Signed-off-by: 1234Erwan <[email protected]>
@gthvn1 Thanks for all your suggestion and for your help, i'v just implemented your idea. |
results[disk] = json.loads(cmd['stdout']) | ||
devices = _list_devices() | ||
for device in devices: | ||
cmd = run_command(["smartctl", "-j", "-a", "-d", devices[i]['name'], devices[i]['type']], check=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that you don't need the index anymore. Now you can use device['name']
instead of devices[i]['name']
. So you can remove the init i = 0
and also the increment i = i + 1
.
cmd = run_command(["smartctl", "-j", "-H", disk]) | ||
devices = _list_devices() | ||
for device in devices: | ||
cmd = run_command(["smartctl", "-j", "-H", "-d", devices[i]['name'], devices[i]['type']]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. If I'm right you don't need index anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I understand why it fails on my machine. It is because the exit status of the command is 4. In you case I think that it is 0 so it is ok. To fix my machine you just need to add the parameter check=False)
at the end of the call. Like this: cmd = run_command(["smartctl", "-j", "-H", "-d", device['name'], device['type']], check=False)
Signed-off-by: 1234Erwan <[email protected]>
It's done, you've finnaly wrote the entierly of the code yourself 😅. |
Your code helps us to think about the issue and even if it changed a little bit it helped us a lot 👍 |
And for information I have created a branch gtn-pr-46 that has your current commit. The reason is because as it is a local branch it triggers more check (like pytest and pycodestyle) than yours. So don't worry I will use this new branch and fix the tests but your commits will be in. I will probably squash them into one commit but it will be there and your contribution will remain :) |
Fine, do as you please. I don't care much for my meager contribution. I much prefer to know that smartctl will work for everyone. |
No it didn't working, as you've said earlier the exit status is not 0. |
devices = _list_devices() | ||
for device in devices: | ||
cmd = run_command(["smartctl", "-j", "-a", "-d", device['name'], device['type']], check=False) | ||
results[device] = json.loads(cmd['stdout']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't work because device is a dict and cannot be used as a key. The problem with megraid is that we now have 2 devices with the same name. Something like:
/dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], SCSI device
/dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], SCSI device
So if we use the name like it is done in the current version we will have the second /dev/bus/0
that will overwrite the first one. So in XO we will see information and health but at least info will only be the last device... But maybe it is ok. Otherwise I think about using a key like /dev/bus/0@megaraid,1
. I'm currently checking with XO team what can be done. For now just use device["name"]
I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With my old program that works with increments of i, it worked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give me the output of smartctl --scan
? With this I will be able to get the equivalent of what is returned by _list_devices()
and I will be able to run some unittests.
When you say it works you try using xe host-call-plugin host-uuid=<HOSTUUID> plugin=smartctl.py fn=information
and xe host-call-plugin host-uuid=<HOSTUUID> plugin=smartctl.py fn=health
or just by looking the result in XO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmd = run_command(["smartctl", "-j", "-H", disk]) | ||
devices = _list_devices() | ||
for device in devices: | ||
cmd = run_command(["smartctl", "-j", "-H", "-d", device['name'], device['type']], check=False) | ||
json_output = json.loads(cmd['stdout']) | ||
if json_output['smart_status']['passed']: | ||
results[disk] = "PASSED" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
disk
looks wrong... should be device["name"]
probably ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will test that immediatly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't seems to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think that you can give a try using this one smartctl.py ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I found the issue. This new version of smartctl.py should work better. Can you give it a try?
And thanks a lot for your help and patience for testing it 👍 I don't have such hardware myself so it is really appreciate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, I have no problem testing the code, moreover it only takes two commands 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the last version should work 🤞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@1234Erwan , Have you had the opportunity to test the latest version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @gthvn1, I pulled your latest smartctl.py and gave it a spin. It runs successfully on my host (XO reports All disks are healthy
; no errors in /var/log/xensource.log
)
For context:
# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/bus/0 -d megaraid,4 # /dev/bus/0 [megaraid_disk_04], SCSI device
/dev/bus/0 -d megaraid,5 # /dev/bus/0 [megaraid_disk_05], SCSI device
/dev/bus/0 -d megaraid,6 # /dev/bus/0 [megaraid_disk_06], SCSI device
/dev/bus/0 -d megaraid,7 # /dev/bus/0 [megaraid_disk_07], SCSI device
Try to add hard raid support for smartctl.py