Decrease in AUC Score After Standardization of Node and Edge Features in Custom Graph Dataset #109

Open
Naotake-Ishikura opened this issue Aug 13, 2024 · 0 comments

I tried using DOMINANT for anomaly detection on a custom dataset. The dataset consists of undirected graphs, each with one node feature and one edge feature. Graph sizes range from 3 nodes to approximately 1000 nodes.

When training DOMINANT with this custom dataset, I observed a decrease in the AUC score for anomalous graphs when standardizing the node and edge features during preprocessing, compared to not standardizing them. I had expected that standardization would stabilize the features by unifying their scales, so I'm unsure why the AUC score decreased.

Additionally, I found that after standardization, if I add 2 to all values of the node and edge features (to adjust the minimum value to be above zero), the AUC score for anomalous graphs improves (resulting in an AUC score similar to that before standardization).
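To make the three preprocessing variants concrete, here is a minimal sketch of what standardization and the subsequent +2 shift do to a feature column. The feature values are made up for illustration, and the helper names `standardize` and `summarize` are hypothetical, not part of PyGOD. It does not explain the AUC change by itself; it only shows that the shift leaves the spread untouched and changes the mean and the fraction of negative values, which may be a useful thing to compare across the three runs.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores (zero mean, unit population std) for a list of floats."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def summarize(values):
    """Collect the statistics that differ between the preprocessing variants."""
    return {
        "mean": mean(values),
        "std": pstdev(values),
        "min": min(values),
        "frac_negative": sum(v < 0 for v in values) / len(values),
    }

# Made-up, non-negative node feature values (illustration only).
raw = [0.5, 1.0, 1.5, 2.0, 10.0]
z = standardize(raw)
shifted = [v + 2.0 for v in z]  # the "+2" variant described above

for name, vals in [("raw", raw), ("standardized", z), ("standardized + 2", shifted)]:
    print(name, summarize(vals))
```

Running the same summary on the real node and edge features (computed over the whole training set, not per graph) would show whether the raw and shifted variants share a property, such as all values being positive, that the standardized variant lacks.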

If you have any insights into the cause or how to further investigate this issue, I would appreciate your advice.

Below are the parameter settings for DOMINANT and DataLoader, which are mostly set to their default values.

DOMINANT:

epoch = 10

DataLoader:

batch_size = 4
shuffle = False
drop_last = True
pin_memory = True
num_workers = 2
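One side effect of these loader settings worth checking: with `drop_last = True`, a PyTorch-style loader discards the final incomplete batch. The sketch below only does the arithmetic; `graphs_seen` is a hypothetical helper, and whether the same loader settings are also used when computing anomaly scores is an assumption, since the post does not say.

```python
def graphs_seen(num_graphs, batch_size, drop_last):
    """Number of graphs a loader yields in one pass over the dataset."""
    full_batches, remainder = divmod(num_graphs, batch_size)
    if drop_last:
        # The trailing partial batch (remainder graphs) is discarded.
        return full_batches * batch_size
    return num_graphs

# Example: 10 graphs with batch_size = 4.
print(graphs_seen(10, 4, drop_last=True))   # 8 graphs, last 2 are skipped
print(graphs_seen(10, 4, drop_last=False))  # 10 graphs
```

Because `shuffle = False`, the skipped graphs would be the same ones in every epoch, so it may be worth confirming that the number of graph-level scores matches the number of labels before computing AUC.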

Note: I'm using the DataLoader from torch_geometric.loader and training for 50 epochs. Since the anomaly detection is done on a graph level, I'm using the pygod.utils.to_graph_score function to calculate the anomaly scores, and the AUC score is used as the evaluation metric for these anomaly scores.
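The evaluation step described in the note can be sketched without any library dependencies. This is an illustration under stated assumptions, not PyGOD's implementation: `graph_scores` uses a plain per-graph mean of node scores as a stand-in for the aggregation (whether that matches `pygod.utils.to_graph_score` exactly is an assumption), `roc_auc` is a hand-written pairwise AUC, and all scores and labels are made up.

```python
def graph_scores(node_scores, batch):
    """Average node-level anomaly scores within each graph.

    batch[i] is the index of the graph that node i belongs to.
    """
    sums, counts = {}, {}
    for score, graph_id in zip(node_scores, batch):
        sums[graph_id] = sums.get(graph_id, 0.0) + score
        counts[graph_id] = counts.get(graph_id, 0) + 1
    return [sums[g] / counts[g] for g in sorted(sums)]

def roc_auc(labels, scores):
    """AUC as the probability that an anomalous graph outscores a normal one.

    Ties between an anomalous and a normal graph count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one graph of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: 7 nodes spread over 3 graphs; graph 1 is anomalous.
node_scores = [0.1, 0.2, 0.3, 0.9, 0.7, 0.2, 0.4]
batch = [0, 0, 0, 1, 1, 2, 2]
labels = [0, 1, 0]

scores = graph_scores(node_scores, batch)
auc = roc_auc(labels, scores)
print(scores, auc)
```

Comparing the per-graph score distributions for the raw, standardized, and shifted runs, rather than only the final AUC, may help show where the ranking of anomalous graphs changes. An AUC well below 0.5 in one variant would suggest the ranking is inverted rather than merely noisier.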
