Merge pull request #15 from UCA-Datalab/develop
Develop
Showing 14 changed files with 905 additions and 96 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,6 +2,7 @@ notebooks/ | |
plots/ | ||
output*/ | ||
*.csv | ||
*.ipynb | ||
|
||
# Log | ||
log.out | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,67 @@ | ||
# NILM: classification VS regression | ||
<!-- README template: https://github.com/othneildrew/Best-README-Template --> | ||
|
||
<!-- PROJECT SHIELDS --> | ||
<!-- | ||
*** I'm using markdown "reference style" links for readability. | ||
*** Reference links are enclosed in brackets [ ] instead of parentheses ( ). | ||
*** See the bottom of this document for the declaration of the reference variables | ||
*** for contributors-url, forks-url, etc. This is an optional, concise syntax you may use. | ||
*** https://www.markdownguide.org/basic-syntax/#reference-style-links | ||
--> | ||
[![Contributors][contributors-shield]][contributors-url] | ||
[![Forks][forks-shield]][forks-url] | ||
[![Stargazers][stars-shield]][stars-url] | ||
[![Issues][issues-shield]][issues-url] | ||
[![LinkedIn][linkedin-shield]][linkedin-url] | ||
|
||
<!-- PROJECT LOGO --> | ||
<br /> | ||
<p align="center"> | ||
<a href="https://github.com/UCA-Datalab"> | ||
<img src="images/logo.png" alt="Logo" width="400" height="80"> | ||
</a> | ||
|
||
<h3 align="center">NILM: classification VS regression</h3> | ||
</p> | ||
|
||
|
||
<!-- TABLE OF CONTENTS --> | ||
<details open="open"> | ||
<summary>Table of Contents</summary> | ||
<ol> | ||
<li> | ||
<a href="#about-the-project">About The Project</a> | ||
</li> | ||
<li> | ||
<a href="#getting-started">Getting Started</a> | ||
<ul> | ||
<li><a href="#create-the-environment">Create the Environment</a></li> | ||
</ul> | ||
</li> | ||
<li> | ||
<a href="#datasets">Datasets</a> | ||
<ul> | ||
<li><a href="#uk-dale">UK-DALE</a></li> | ||
</ul> | ||
<ul> | ||
<li><a href="#pecan-street-dataport">Pecan Street Dataport</a></li> | ||
</ul> | ||
<li><a href="#preprocess-the-data">Preprocess the Data</a></li> | ||
</li> | ||
<li> | ||
<a href="#train">Train</a> | ||
<ul> | ||
<li><a href="#reproduce-the-paper">Reproduce the Paper</a></li> | ||
<li><a href="#thresholding-methods">Thresholding Methods</a></li> | ||
</ul> | ||
</li> | ||
<li><a href="#publications">Publications</a></li> | ||
<li><a href="#contact">Contact</a></li> | ||
<li><a href="#acknowledgements">Acknowledgements</a></li> | ||
</ol> | ||
</details> | ||
|
||
## About the project | ||
|
||
Non-Intrusive Load Monitoring (NILM) aims to predict the status | ||
or consumption of domestic appliances in a household only by knowing | ||
|
@@ -14,10 +77,10 @@ deep learning state-of-the-art architectures on both the regression and | |
classification problems, introducing criteria to select the most convenient | ||
thresholding method. | ||
|
||
Source: [see publications](#publications) | ||
## Getting started | ||
### Create the Environment | ||
|
||
## Set up | ||
### Create the environment using Conda | ||
To create the environment using Conda: | ||
|
||
1. Install miniconda | ||
|
||
|
@@ -43,12 +106,10 @@ Source: [see publications](#publications) | |
conda activate nilm-thresholding | ||
``` | ||
## Data | ||
## Datasets | ||
### UK-DALE | ||
#### Download UK-DALE | ||
The UK-DALE dataset is hosted at the following link: | ||
[https://data.ukedc.rl.ac.uk/browse/edc/efficiency/residential | ||
/EnergyConsumption/Domestic/UK-DALE-2017/UK-DALE-FULL-disaggregated](https://data.ukedc.rl.ac.uk/browse/edc/efficiency/residential/EnergyConsumption/Domestic/UK-DALE-2017/UK-DALE-FULL-disaggregated) | ||
|
@@ -69,7 +130,15 @@ nilm-thresholding | |
Credit: [Jack Kelly](https://jack-kelly.com/data/) | ||
### Preprocess | ||
### Pecan Street Dataport | ||
We are aiming to include this dataset in a future release. You can check the issue here: [https://github.com/UCA-Datalab/nilm-thresholding/issues/8](https://github.com/UCA-Datalab/nilm-thresholding/issues/8) | ||
Any help and suggestions are welcome! | ||
Credit: [Pecan Street](https://dataport.pecanstreet.org/) | ||
## Preprocess the Data | ||
Once you have downloaded the raw data from any of the sources above, | ||
you must preprocess it. | ||
|
@@ -106,23 +175,23 @@ If you want to use your own set of parameters, duplicate the aforementioned | |
configuration file and modify the parameters you want to change (without deleting any | ||
parameter). You can then use that config file with the following command: | ||
``` | ||
``` | ||
python nilmth/train.py --path_config <path to your config file> | ||
``` | ||
For more information about the script, run: | ||
``` | ||
``` | ||
python nilmth/train.py --help | ||
``` | ||
Once the models are trained, test them with: | ||
``` | ||
``` | ||
python nilmth/test.py --path_config <path to your config file> | ||
``` | ||
#### Reproduce paper | ||
### Reproduce the Paper | ||
To reproduce the results shown in [our paper](#publications), activate the | ||
environment and then run: | ||
|
@@ -136,11 +205,11 @@ models are stored. Then, the script `train.py` will be called, using each | |
configuration. This will store the model weights, which will be used | ||
again during the test phase: | ||
``` | ||
``` | ||
nohup sh test_sequential.sh > log.out & | ||
``` | ||
### Thresholding methods | ||
### Thresholding Methods | ||
There are three thresholding methods available. Read [our paper](#publications) | ||
to understand how each threshold works. | ||
|
@@ -151,13 +220,32 @@ to understand how each threshold works. | |
## Publications | ||
[NILM as a regression versus classification problem: | ||
* [NILM as a regression versus classification problem: | ||
the importance of thresholding](https://www.researchgate.net/project/Non-Intrusive-Load-Monitoring-6) | ||
## Contact information | ||
## Contact | ||
Daniel Precioso - [daniprec](https://github.com/daniprec) - [email protected] | ||
Project link: [https://github.com/UCA-Datalab/nilm-thresholding](https://github.com/UCA-Datalab/nilm-thresholding) | ||
ResearchGate link: [https://www.researchgate.net/project/NILM-classification-VS-regression](https://www.researchgate.net/project/NILM-classification-VS-regression) | ||
## Acknowledgements | ||
* [UCA DataLab](http://datalab.uca.es/) | ||
* [David Gómez-Ullate](https://www.linkedin.com/in/david-g%C3%B3mez-ullate-oteiza-87a820b/?originalSubdomain=en) | ||
Author: Daniel Precioso, PhD student at Universidad de Cádiz | ||
- Email: [email protected] | ||
- [Github](https://github.com/daniprec) | ||
- [LinkedIn](https://www.linkedin.com/in/daniel-precioso-garcelan/) | ||
- [ResearchGate](https://www.researchgate.net/profile/Daniel_Precioso_Garcelan) | ||
<!-- MARKDOWN LINKS & IMAGES --> | ||
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links --> | ||
[contributors-shield]: https://img.shields.io/github/contributors/UCA-Datalab/nilm-thresholding.svg?style=for-the-badge | ||
[contributors-url]: https://github.com/UCA-Datalab/nilm-thresholding/graphs/contributors | ||
[forks-shield]: https://img.shields.io/github/forks/UCA-Datalab/nilm-thresholding.svg?style=for-the-badge | ||
[forks-url]: https://github.com/UCA-Datalab/nilm-thresholding/network/members | ||
[stars-shield]: https://img.shields.io/github/stars/UCA-Datalab/nilm-thresholding.svg?style=for-the-badge | ||
[stars-url]: https://github.com/UCA-Datalab/nilm-thresholding/stargazers | ||
[issues-shield]: https://img.shields.io/github/issues/UCA-Datalab/nilm-thresholding.svg?style=for-the-badge | ||
[issues-url]: https://github.com/UCA-Datalab/nilm-thresholding/issues | ||
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555 | ||
[linkedin-url]: https://www.linkedin.com/in/daniel-precioso-garcelan/ |
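The README above frames NILM as either a regression problem (predicting consumption) or a classification problem (predicting status), with thresholding linking the two. As a minimal sketch of that link, the snippet below turns power readings into an ON/OFF status with a single fixed threshold. The function name `power_to_status`, the sample readings, and the 10 W threshold are illustrative assumptions. This is not one of the three thresholding methods from the paper, whose definitions are not shown in this diff.

```python
from typing import List, Sequence


def power_to_status(power: Sequence[float], threshold: float) -> List[int]:
    """Map each power reading (watts) to 1 (ON) if it reaches the threshold, else 0 (OFF)."""
    return [1 if p >= threshold else 0 for p in power]


# Hypothetical appliance readings in watts
readings = [0.0, 2.5, 85.0, 90.2, 1.0]
status = power_to_status(readings, threshold=10.0)
print(status)  # [0, 0, 1, 1, 0]
```

A regression model would be trained against `readings` directly, while a classification model would be trained against `status`, so the choice of threshold changes the classification target.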
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,160 @@ | ||
import itertools
from typing import Optional, Tuple

import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster.hierarchy import cophenet, dendrogram, fcluster, linkage
from scipy.spatial.distance import pdist


class HierarchicalClustering:
    def __init__(
        self, distance: str = "average", n_cluster: int = 2, criterion: str = "maxclust"
    ):
        """This object is able to perform Hierarchical Clustering on a given set of points

        Parameters
        ----------
        distance : str, optional
            Clustering distance criteria, by default "average"
        n_cluster : int, optional
            Number of clusters to form, by default 2
        criterion : str, optional
            Criterion used to compute the clusters, by default "maxclust"
        """
        self.distance = distance
        self.n_cluster = n_cluster
        self.criterion = criterion

        # Attributes filled with `perform_clustering`
        self.x = np.empty(0)  # Set of data points
        self.z = np.empty(0)  # The hierarchical clustering encoded as a linkage matrix
        # z[i] will tell us which clusters were merged in the i-th iteration

        # Attributes filled with `plot_dendrogram`
        self.dendrogram = {}
        # A dictionary of data structures computed to render the dendrogram

        # Attributes filled with `compute_thresholds_and_centroids`
        self.thresh = np.empty(0)
        self.centroids = np.empty(0)

    def perform_clustering(
        self, ser: np.ndarray, distance: Optional[str] = None
    ) -> None:
        """Performs the actual clustering, using the linkage function

        Parameters
        ----------
        ser : np.ndarray
            Series of points to group in clusters
        distance : str, optional
            Clustering distance criteria, by default None (takes the one from the class)
        """
        self.distance = distance if distance is not None else self.distance
        # The shape of our X matrix must be (n, m)
        # n = samples, m = features
        self.x = np.expand_dims(ser, axis=1)
        self.z = linkage(self.x, method=self.distance)

    @property
    def cophenet(self):
        # Cophenetic correlation coefficient
        c, _ = cophenet(self.z, pdist(self.x))
        return c

    def plot_dendrogram(
        self,
        p: int = 6,
        max_d: Optional[float] = None,
        figsize: Tuple[int, int] = (3, 3),
    ):
        """Plots the dendrogram

        Parameters
        ----------
        p : int, optional
            Number of last merged clusters to show, by default 6
        max_d : Optional[float], optional
            Distance at which to draw a vertical line, by default None (no line)
        figsize : Tuple[int, int], optional
            Figure size, by default (3, 3)
        """
        fig, ax = plt.subplots(figsize=figsize)
        self.dendrogram = dendrogram(
            self.z,
            p=p,
            orientation="right",
            truncate_mode="lastp",
            labels=self.x[:, 0],
            ax=ax,
        )
        if max_d is not None:
            ax.axvline(x=max_d, c="k")
        return fig, ax

    @property
    def dendrogram_distance(self):
        # Unique merge distances of the plotted dendrogram, largest first
        return sorted(set(itertools.chain(*self.dendrogram["dcoord"])), reverse=True)

    def plot_dendrogram_distance(self, figsize: Tuple[int, int] = (10, 3)):
        """Plots the dendrogram distances

        Parameters
        ----------
        figsize : Tuple[int, int], optional
            Size of the figure, by default (10, 3)
        """
        # Initialize plots
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=figsize)
        # Dendrogram distance
        ax1.scatter(
            range(2, len(self.dendrogram_distance) + 1), self.dendrogram_distance[:-1]
        )
        ax1.set_ylabel("Distance")
        ax1.set_xlabel("Number of clusters")
        ax1.grid()
        # Dendrogram distance difference
        diff = np.divide(
            -np.diff(self.dendrogram_distance), self.dendrogram_distance[:-1]
        )
        ax2.scatter(range(3, len(self.dendrogram_distance) + 1), diff[:-1])
        ax2.set_ylabel("Gradient")
        ax2.set_xlabel("Number of clusters")
        ax2.grid()
        return fig, (ax1, ax2)

    def compute_thresholds_and_centroids(
        self,
        n_cluster: Optional[int] = None,
        criterion: Optional[str] = None,
        centroid: str = "median",
    ):
        """Computes the thresholds and centroids of each group

        Parameters
        ----------
        n_cluster : Optional[int], optional
            Number of clusters, by default None
        criterion : Optional[str], optional
            Criterion used to compute the clusters, by default None
        centroid : str, optional
            Method to compute the centroids (median or mean), by default "median"
        """
        self.n_cluster = n_cluster if n_cluster is not None else self.n_cluster
        self.criterion = criterion if criterion is not None else self.criterion
        clusters = fcluster(self.z, self.n_cluster, self.criterion)
        # Get centroids
        if centroid == "median":
            fun = np.median
        elif centroid == "mean":
            fun = np.mean
        else:
            # Fail early instead of hitting an unbound `fun` below
            raise ValueError(f"centroid must be 'median' or 'mean', got {centroid!r}")
        self.centroids = np.array(
            sorted([fun(self.x[clusters == (c + 1)]) for c in range(self.n_cluster)])
        )
        # Sort clusters by power
        x_max = sorted(
            [np.max(self.x[clusters == (c + 1)]) for c in range(self.n_cluster)]
        )
        x_min = sorted(
            [np.min(self.x[clusters == (c + 1)]) for c in range(self.n_cluster)]
        )
        # Each threshold is the midpoint between two neighbouring clusters
        thresh = np.divide(np.array(x_min[1:]) + np.array(x_max[:-1]), 2)
        # The lowest cluster starts at 0
        self.thresh = np.insert(thresh, 0, 0, axis=0)
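
The threshold arithmetic at the end of `compute_thresholds_and_centroids` can be followed more easily in isolation. The sketch below reproduces it with the standard library only, assuming the cluster labels are already known (in the class they come from `fcluster`). The function name `thresholds_and_centroids`, the sample readings, and the labels are hypothetical and chosen for illustration.

```python
from statistics import median
from typing import Dict, List, Sequence, Tuple


def thresholds_and_centroids(
    values: Sequence[float], labels: Sequence[int]
) -> Tuple[List[float], List[float]]:
    """Return (thresholds, centroids) for 1-D values grouped by cluster label."""
    groups: Dict[int, List[float]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    # Median centroid of each cluster, sorted from lowest to highest
    centroids = sorted(median(g) for g in groups.values())
    lows = sorted(min(g) for g in groups.values())
    highs = sorted(max(g) for g in groups.values())

    # Each boundary is halfway between the top of one cluster and the
    # bottom of the next; the lowest cluster starts at 0, as in the class.
    midpoints = [(lo + hi) / 2 for lo, hi in zip(lows[1:], highs[:-1])]
    return [0.0] + midpoints, centroids


# Hypothetical power readings (watts) and their cluster labels
power = [0.0, 1.0, 2.0, 50.0, 52.0, 55.0, 200.0, 210.0]
labels = [1, 1, 1, 2, 2, 2, 3, 3]
thresholds, centroids = thresholds_and_centroids(power, labels)
print(thresholds)  # [0.0, 26.0, 127.5]
print(centroids)   # [1.0, 52.0, 205.0]
```

With three clusters, the second threshold (26.0) sits midway between 2.0 and 50.0, and the third (127.5) midway between 55.0 and 200.0.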