LiveSurvey Class #264

Draft
wants to merge 83 commits into
base: main
Changes from 79 commits
b56b618
Establishing initial `LiveSurvey`
brandynlucca Jul 15, 2024
00c898d
Initial data loading function refactoring
brandynlucca Jul 15, 2024
69340e7
Updated methods
brandynlucca Jul 16, 2024
3adf521
Updated methods/processing (plus SQL)
brandynlucca Jul 20, 2024
9b79d81
Updating `LiveSurvey` methods
brandynlucca Jul 22, 2024
382d444
General changes
brandynlucca Jul 22, 2024
2374ce9
Committed SQL changes
brandynlucca Jul 25, 2024
7f49f31
Reorganize loading functions
brandynlucca Jul 25, 2024
6d439bb
General changes
brandynlucca Aug 1, 2024
a4a51a6
Format some changes to methods
brandynlucca Aug 1, 2024
e395405
Quick patch
brandynlucca Aug 1, 2024
2f04cab
Fleshed out biology processing methods
brandynlucca Aug 1, 2024
c95cf8d
Further refinement of `process_biology_data` meth
brandynlucca Aug 1, 2024
4e4ca87
Complete biology processing code
brandynlucca Aug 2, 2024
d0f4208
More changes to methods
brandynlucca Aug 5, 2024
6020f79
Full drafted workflow
brandynlucca Aug 7, 2024
1ee2e20
Patches
brandynlucca Aug 8, 2024
0f31b20
Cleaned up `test_workflow`
brandynlucca Aug 8, 2024
40f3d7b
YAML config settings adjustment for db dir
brandynlucca Aug 12, 2024
63e7961
f-string fix for coastline db file creation
brandynlucca Aug 12, 2024
ab6d9ff
Fix to stratum/spatial config key name
brandynlucca Aug 12, 2024
8dd470c
Fix to database directory initialization
brandynlucca Aug 12, 2024
b6fbae5
Additional db directory path changes/fixes
brandynlucca Aug 12, 2024
6c6214f
Fix `data_root_dir` missing workaround
brandynlucca Aug 12, 2024
d1bdc2c
db pathing issues fixed
brandynlucca Aug 12, 2024
3e252a0
`data_root_dir` check for `read_biology_files`
brandynlucca Aug 12, 2024
a1cec01
Gridding methods
brandynlucca Aug 13, 2024
dd87bc9
Grid fix
brandynlucca Aug 13, 2024
8693641
Add `xarray` kwargs options
brandynlucca Aug 14, 2024
6af7aa5
Merge branch 'WIP_LiveSurvey_class' of https://github.com/brandynlucc…
brandynlucca Aug 14, 2024
faed21a
`pandas` kwargs storage options
brandynlucca Aug 14, 2024
af93851
`xarray_kwargs` patch
brandynlucca Aug 14, 2024
0b13fa7
Disable file/directory existence checker
brandynlucca Aug 14, 2024
9e9ae07
Remove `Path` typing for acoustic zarr input
brandynlucca Aug 14, 2024
a46ccac
Attempts pathing fixes
brandynlucca Aug 14, 2024
48dd27a
More Path removal changes
brandynlucca Aug 14, 2024
7c5d38e
More Path removal
brandynlucca Aug 14, 2024
152b703
Coastline db update fixes (pathing)
brandynlucca Aug 14, 2024
c75be73
Add `storage_options` input for `pyogrio.read_file`
brandynlucca Aug 14, 2024
755c9cf
Fix to `storage_options` arg for `geopandas`
brandynlucca Aug 14, 2024
9fc2493
Updated `pyogrio` engine settings
brandynlucca Aug 14, 2024
1c6c81a
Fixed random/inconsistent column key missing
brandynlucca Aug 14, 2024
95901d3
Change files read/processed tracking
brandynlucca Aug 14, 2024
10ebb88
Add file read checkpointing (`load_biology_data`)
brandynlucca Aug 14, 2024
3e1d060
fix to `read_csv`
brandynlucca Aug 15, 2024
5ed0ce2
Fix glob cmd
brandynlucca Aug 15, 2024
c6697d2
Index fix
brandynlucca Aug 15, 2024
c7d2244
Fixed methods for s3 bucket
brandynlucca Aug 15, 2024
89880bb
Removed f-string
brandynlucca Aug 15, 2024
7244246
`live_visualizer` module
brandynlucca Aug 15, 2024
76085c3
Minor changes to axis labels
brandynlucca Aug 15, 2024
e1ec7a0
Plotting function for bio distributions
brandynlucca Aug 15, 2024
b36e284
Possible fix to `BIGINT` SQL error
brandynlucca Aug 16, 2024
c761a9b
Validator for successful population run
brandynlucca Aug 16, 2024
0f1e8f2
Fix to population validation handshake
brandynlucca Aug 16, 2024
370b16f
Database pathing changes
brandynlucca Aug 16, 2024
aca08e4
Fixes to oddities due to `NaN` for cruise plot
brandynlucca Aug 16, 2024
eaec504
Updated plotting method for `None`
brandynlucca Aug 16, 2024
0924489
Matplotlib to panel update
brandynlucca Aug 16, 2024
1a90186
Panel naming update
brandynlucca Aug 17, 2024
32c0b99
Cleaned up `test_workflow`
brandynlucca Aug 17, 2024
65ab70d
Changed dynamic colorrange for some plots
brandynlucca Aug 17, 2024
27ff2d3
Fix to grid plot colormap scaling/range
brandynlucca Aug 19, 2024
8e21e13
Add dataset validator for biodata
brandynlucca Aug 19, 2024
0d8e732
Apply biodata validator only to biodata...
brandynlucca Aug 19, 2024
3ab32ac
Fix to biodata dataset validator
brandynlucca Aug 20, 2024
80d2b55
f-string adjustment
brandynlucca Aug 20, 2024
dd24ebd
Change to enable multiple ship data sources
brandynlucca Aug 21, 2024
3c4830a
Fix to cases where lon/lat/ping_time were NaN/NaT
brandynlucca Aug 21, 2024
4bce560
f-string fix for sql_methods
brandynlucca Aug 21, 2024
50937f1
Fixed `ship_id` f-string issue
brandynlucca Aug 21, 2024
f324647
Fixes to odd SQL table column shuffling
brandynlucca Aug 22, 2024
f0f8001
Minor improvements to visualizer code
brandynlucca Aug 22, 2024
5fcc0c9
New configuration file validator
brandynlucca Aug 22, 2024
7db9764
Data reading validators
brandynlucca Aug 22, 2024
cef3036
Clarified config validation error messages
brandynlucca Aug 26, 2024
0aa88ac
Pre-commit formatting changes
brandynlucca Aug 26, 2024
218df8a
Pruned `test_workflow.py`
brandynlucca Aug 28, 2024
5a0eac8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 28, 2024
2a277a6
add echopop live viz cmap and fig seq tweaks
Sohambutala Oct 27, 2024
30c05ff
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 27, 2024
99a91a7
tweak zorder of scatter and line in track plot, tweak units
Sohambutala Oct 27, 2024
25ee616
Merge branch 'WIP_LiveSurvey_class' of https://github.com/brandynlucc…
Sohambutala Oct 27, 2024
65 changes: 65 additions & 0 deletions config_files/live_initialization_config.yml
@@ -0,0 +1,65 @@
# This YAML file is a configuration file for all
# initialization parameters used for the `LiveSurvey`
# class in Echopop

---
#####################################################################################################################
# Biological data processing#
########################
biology:
  # Length-binning
  # NOTE: start : end : number
  length_distribution:
    bins: [2, 80, 40]
  # Station separation
  # NOTE: if `separate_stations` is True, `['list']` is required for `station_id`
  stations:
    separate_stations: True
    station_id: ["length", "specimen"]
  # Trawl identifier
  catch:
    partition: codend

#####################################################################################################################
# Geospatial settings#
########################
geospatial:
  inpfc: # INPFC northern latitude limits and labels
    latitude_max: [36.0, 40.5, 43.0,
                   45.7667, 48.50, 55.0]
    stratum_names: [1, 2, 3, 4, 5, 6]
  griddify:
    # Coordinate bounds
    bounds:
      latitude: [32.75, 55.50]
      longitude: [-135.25, -117.00]
    # x/y (or E-W/N-S) grid resolution in nmi
    grid_resolution:
      x_distance: 25.0
      y_distance: 25.0
  projection: epsg:4326 # EPSG integer code for geodetic parameter dataset
  # TODO: Remember to convert this back to a string
  # NOTE: `link_biology_acoustics` defines how biological and acoustic data are linked with one another. This
  # comprises True/False statements that denote the desired association. All values set to "True" will be output.
  # `global` --> NASC associated with sigma_bs calculated from all survey data
  # `INPFC` --> NASC for each INPFC stratum associated with matched stratum-specific sigma_bs
  # `closest_haul` --> NASC associated with sigma_bs calculated from the closest (spatially) trawls
  # `weighted_haul` --> NASC associated with sigma_bs calculated from all survey data weighted by distance from haul coordinates
  link_biology_acoustics: INPFC

#####################################################################################################################
# Acoustics settings#
########################
acoustics:
  # Acoustic transmit frequency (Hz or kHz)
  transmit:
    frequency: 38.0
    units: kHz
  # Target strength (TS) - length (L) regression: TS=m*log10(L)+b
  TS_length_regression_parameters:
    pacific_hake: # corresponding species text code
      number_code: 22500 # species number code
      TS_L_slope: 20.0 # the 'm' or 'slope' parameter
      TS_L_intercept: -68.0 # the 'b' or 'y-intercept'
      length_units: cm # units for L used in regression/relationship
...
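To make the numeric settings in this configuration concrete, here is a minimal Python sketch of how they might be consumed. It is not taken from the PR. The function names (`length_bin_centers`, `inpfc_stratum`, `target_strength`, `sigma_bs`) are hypothetical, and two interpretations are assumptions: that `bins: [2, 80, 40]` is read as `start, end, number` in the `numpy.linspace` sense, and that each `latitude_max` value is an inclusive northern limit for the stratum label at the same position.

```python
import numpy as np

# Values copied from live_initialization_config.yml.
LENGTH_BINS = [2, 80, 40]  # start, end, number
INPFC_LATITUDE_MAX = [36.0, 40.5, 43.0, 45.7667, 48.50, 55.0]
INPFC_STRATUM_NAMES = [1, 2, 3, 4, 5, 6]
TS_L_SLOPE = 20.0       # 'm' in TS = m*log10(L) + b
TS_L_INTERCEPT = -68.0  # 'b' in TS = m*log10(L) + b


def length_bin_centers(start, end, number):
    """Evenly spaced length values (assumed reading of start:end:number)."""
    return np.linspace(start, end, number)


def inpfc_stratum(latitude):
    """Return the stratum label whose northern limit first covers `latitude`.

    Assumption: limits are inclusive, and latitudes north of the last
    limit fall outside every stratum (returns None).
    """
    idx = int(np.searchsorted(INPFC_LATITUDE_MAX, latitude, side="left"))
    if idx >= len(INPFC_STRATUM_NAMES):
        return None
    return INPFC_STRATUM_NAMES[idx]


def target_strength(length_cm, slope=TS_L_SLOPE, intercept=TS_L_INTERCEPT):
    """Target strength in dB from fish length in cm: TS = m*log10(L) + b."""
    return slope * np.log10(length_cm) + intercept


def sigma_bs(ts_db):
    """Convert target strength (dB) to linear backscattering cross-section."""
    return 10.0 ** (ts_db / 10.0)


centers = length_bin_centers(*LENGTH_BINS)
print(centers[:3], centers[-1])       # first three values and the last one
print(inpfc_stratum(41.0))            # 41.0 N lies between 40.5 and 43.0
print(target_strength(50.0))          # roughly -34.02 dB for a 50 cm fish
print(sigma_bs(target_strength(50.0)))
```

With these settings the 40 values are spaced 2 cm apart from 2 to 80, which is one reason the `linspace` reading seems plausible; the actual binning logic in `LiveSurvey` may differ.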
54 changes: 54 additions & 0 deletions config_files/live_survey_year_2019_config.yml
@@ -0,0 +1,54 @@
# This YAML file is a configuration file specifying
# input filenames & some process parameter settings.
# Relative file paths defined below are concatenated
# with the data_root_dir path also set below.

---
##############################################################################
# Parameters

ship_id: R/V Shimada
survey_year: 2024 # survey year being considered
species:
  text_code: pacific_hake # target species for the survey year -- species name
  number_code: 22500 # target species for the survey year -- numeric code
##############################################################################
# Directory path that contains all input data needed

data_root_dir: C:/Users/Brandyn/Documents/GitHub/EchoPro_data/live_2019_files
database_directory: C:/Users/Brandyn/Documents/GitHub/EchoPro_data/live_2019_files/database

##############################################################################
# Input data directories
input_directories:
  acoustics:
    directory: acoustics/
    database_name: acoustics.db
    extension: zarr
  biology:
    directory: biology/
    # directory: s3://sh2407-upload/data/Echopop-biology/
    database_name: biology.db
    extension: csv
    file_name_formats:
      catch: "{DATE:YYYYMM}_{HAUL}_{FILE_ID:catch_perc}"
      length: "{DATE:YYYYMM}_{SPECIES_CODE}_{HAUL}_{FILE_ID:lf}"
      specimen: "{DATE:YYYYMM}_{SPECIES_CODE}_{HAUL}_{FILE_ID:spec}"
      trawl_info: "{DATE:YYYYMM}_{HAUL}_{FILE_ID:operation_info}"
    file_index:
      catch: [haul_num]
      length: [haul_num, species_id]
      specimen: [haul_num, species_id]
      trawl_info: []
    file_ids:
      catch: catch_perc
      length: lf
      specimen: spec
      trawl_info: operation_info
  coastline:
    directory: coastline/
    coastline_name: ne_10m_land
  grid:
    database_name: grid.db

...
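The `file_name_formats` templates above suggest that file names encode the date, haul, species code, and a file identifier. The PR's own parser is not shown in this diff, so the following is only a sketch of one way such templates could be turned into regular expressions. The helper `template_to_regex` and the example file names are hypothetical; the sketch assumes `{DATE:YYYYMM}` means a run of digits as long as the format string, `{FILE_ID:...}` means a literal identifier, and unqualified fields contain no underscores.

```python
import re

# Matches placeholders such as {HAUL} or {DATE:YYYYMM}.
_FIELD = re.compile(r"\{(?P<name>[A-Z_]+)(?::(?P<spec>[^}]+))?\}")


def template_to_regex(template: str) -> re.Pattern:
    """Compile a file-name template into a regex with named groups."""
    parts = []
    pos = 0
    for match in _FIELD.finditer(template):
        # Literal text between placeholders (here, the underscores).
        parts.append(re.escape(template[pos:match.start()]))
        name, spec = match.group("name"), match.group("spec")
        if name == "DATE" and spec:
            # Assumption: one digit per character of the date format.
            body = rf"\d{{{len(spec)}}}"
        elif spec:
            # Assumption: any other qualifier is a literal identifier.
            body = re.escape(spec)
        else:
            # Assumption: free fields never contain an underscore.
            body = r"[^_]+"
        parts.append(f"(?P<{name}>{body})")
        pos = match.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(parts) + "$")


length_pattern = template_to_regex(
    "{DATE:YYYYMM}_{SPECIES_CODE}_{HAUL}_{FILE_ID:lf}"
)
catch_pattern = template_to_regex("{DATE:YYYYMM}_{HAUL}_{FILE_ID:catch_perc}")

# Hypothetical file stems, with the extension already removed.
length_fields = length_pattern.match("202407_22500_003_lf").groupdict()
catch_fields = catch_pattern.match("202407_003_catch_perc").groupdict()
print(length_fields)
print(catch_fields)
```

Anchoring the pattern with `^` and `$` keeps a `length` template from accidentally matching a `specimen` file, since the literal `FILE_ID` must appear at the end of the name.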
2 changes: 1 addition & 1 deletion echopop/__init__.py
@@ -3,4 +3,4 @@

 __all__ = ["Survey", "operations"]

-from _echopop_version import version as __version__ # noqa
+# from _echopop_version import version as __version__ # noqa
2 changes: 1 addition & 1 deletion echopop/biology.py
@@ -75,7 +75,7 @@ def fit_length_weight_relationship(
 np.polyfit(np.log10(df["length"]), np.log10(df["weight"]), 1),
 index=["rate", "initial"],
 ),
-include_groups=False,
+# include_groups=False,
 )
 .reset_index()
 )
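For context on the hunk above: the visible lines fit a straight line to log10(weight) against log10(length) within each group and label the two coefficients `rate` and `initial`. The sketch below reproduces that idea on synthetic data. It is not the Echopop implementation; the `sex` grouping column, the data, and the helper name `fit_log_linear` are all assumptions. It selects the `length` and `weight` columns before calling `apply`, which is one way to keep grouping columns out of the applied function without relying on the `include_groups` keyword (to my knowledge a relatively recent pandas addition, so older versions may not accept it).

```python
import numpy as np
import pandas as pd


def fit_log_linear(df: pd.DataFrame) -> pd.Series:
    """Fit log10(weight) = rate * log10(length) + initial for one group."""
    rate, initial = np.polyfit(np.log10(df["length"]), np.log10(df["weight"]), 1)
    return pd.Series([rate, initial], index=["rate", "initial"])


# Synthetic specimens following weight = 1e-5 * length**3 exactly.
specimens = pd.DataFrame(
    {
        "sex": ["female"] * 4 + ["male"] * 4,
        "length": [20.0, 30.0, 40.0, 50.0] * 2,
    }
)
specimens["weight"] = 1e-5 * specimens["length"] ** 3.0

# Selecting the value columns first means the grouping column is never
# passed to `fit_log_linear`, so no `include_groups` argument is needed.
coefficients = (
    specimens.groupby("sex")[["length", "weight"]]
    .apply(fit_log_linear)
    .reset_index()
)
print(coefficients)
```

On this synthetic input the fitted `rate` should recover the exponent 3 and `initial` the log10 of the 1e-5 coefficient, up to floating-point error.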
5 changes: 5 additions & 0 deletions echopop/live/__init__.py
@@ -0,0 +1,5 @@
from echopop.utils import operations

__all__ = ["operations"]

# from _echopop_version import version as __version__ # noqa