Closed
Open
Issue created Mar 31, 2021 by Matias Wargelin (@MWargelin, Owner)

Problem with iterative outlier removal method

I ran into a problem while trying out alternative clustering and outlier removal methods. Setting the clustering method to "optics" or "dbscan", with an otherwise default configuration, fails with the same error in both cases:

config.toml:

[general]
data_rootdir = "/tmp/data_rootdir"
outdir = "/tmp/outdir"
dtgs.start = "2021030500"
dtgs.end = "2021030521"
clustering_method = "optics" # or "dbscan"

error:

Reading config file /tmp/config.toml
DTG=2021-03-05T00 UTC: Started
Traceback (most recent call last):
   File "/usr/local/airflow/.local/bin/netatmoqc", line 8, in <module>
     sys.exit(main())
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/netatmoqc/main.py", line 28, in main
     args.func(args)
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/netatmoqc/commands_functions.py", line 197, in select_stations
     for dtg in config.general.dtgs
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/joblib/parallel.py", line 1041, in __call__
     if self.dispatch_one_batch(iterator):
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/joblib/parallel.py", line 859, in dispatch_one_batch
     self._dispatch(tasks)
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/joblib/parallel.py", line 777, in _dispatch
     job = self._backend.apply_async(batch, callback=cb)
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
     result = ImmediateResult(func)
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 572, in __init__
     self.results = batch()
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/joblib/parallel.py", line 263, in __call__
     for func, args, kwargs in self.items]
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/joblib/parallel.py", line 263, in <listcomp>
     for func, args, kwargs in self.items]
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/netatmoqc/commands_functions.py", line 148, in _select_stations_single_dtg
     df=df, config=config, n_jobs=cpu_share, calc_silhouette_samples=False,
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/netatmoqc/clustering.py", line 524, in cluster_netatmo_obs
     df=df_sub, config=config, **pre_clustering_kwargs
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/netatmoqc/clustering.py", line 452, in _cluster_netatmo_obs_one_domain
     **kwargs,
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/netatmoqc/clustering.py", line 339, in run_clustering_on_df
     reclustering_function=self_consistent_reclustering,
   File "/usr/local/airflow/.local/lib/python3.6/site-packages/netatmoqc/outlier_removal.py", line 364, in filter_outliers
     rtn = filter_outliers_iterative(df, **kwargs)
 TypeError: filter_outliers_iterative() got an unexpected keyword argument 'method'

However, at least with the "optics" clustering method, adding the following to config.toml makes the error disappear, so my guess is that the problem lies in the default iterative outlier removal method:

[clustering_method.optics.outlier_removal]
        method = "lof"
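
For anyone triaging this, here is a minimal sketch of one way such a TypeError can arise. It is not the actual netatmoqc code: only the names filter_outliers and filter_outliers_iterative and the error text come from the traceback above. The function bodies, the max_num_iter parameter, the filter_outliers_lof helper and the "_buggy"/"_fixed" suffixes are all hypothetical. The assumption is that the dispatcher forwards every keyword argument, including 'method', to a function that does not accept it.

```python
# Hypothetical stand-ins: only the two function names and the error text
# come from the traceback; bodies and signatures are assumptions.


def filter_outliers_iterative(df, max_num_iter=10):
    """Stand-in iterative filter; note that it accepts no 'method' keyword."""
    return df


def filter_outliers_lof(df, **kwargs):
    """Stand-in for a non-iterative filter that tolerates extra keywords."""
    return df


def filter_outliers_buggy(df, **kwargs):
    """Forward every keyword, including 'method', to the selected filter."""
    if kwargs.get("method", "iterative") == "iterative":
        # 'method' is still inside kwargs here, so this call raises TypeError.
        return filter_outliers_iterative(df, **kwargs)
    return filter_outliers_lof(df, **kwargs)


def filter_outliers_fixed(df, **kwargs):
    """Consume 'method' in the dispatcher before forwarding the rest."""
    method = kwargs.pop("method", "iterative")
    if method == "iterative":
        return filter_outliers_iterative(df, **kwargs)
    return filter_outliers_lof(df, **kwargs)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    try:
        filter_outliers_buggy(data, method="iterative")
    except TypeError as err:
        print(f"buggy dispatcher: {err}")
    print("fixed dispatcher:", filter_outliers_fixed(data, method="iterative"))
```

In this sketch, popping 'method' inside the dispatcher is one possible fix; another would be for the iterative function to accept and ignore extra keywords. Which of these (if either) applies to netatmoqc depends on how the configuration is passed down, which the traceback alone does not show.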
Edited Mar 31, 2021 by Matias Wargelin