ETL tool
Extract: from sheets. Transform: group by. Load: Looker Studio ext.
Automation built from small steps; no-code platform.
Visual.
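A minimal sketch of the same extract / group-by / load flow in plain Python, to show what the visual steps do underneath. The sheet contents, column names, and the CSV output target are assumptions made up for illustration; a real workflow would read from the actual sheet and write to whatever the reporting tool consumes.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sheet export; the columns are invented for this example.
SHEET_CSV = """region,amount
north,10
south,5
north,7
south,3
"""

def extract(text):
    """Extract: read rows from CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: group by region and sum the amounts."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += float(row["amount"])
    return dict(totals)

def load(totals):
    """Load: write the aggregated table as CSV text for a reporting tool."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["region", "total"])
    for region, total in sorted(totals.items()):
        writer.writerow([region, total])
    return out.getvalue()

report = load(transform(extract(SHEET_CSV)))
print(report)
```

Each function corresponds to one node; chaining them is the "workflow".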
Nodes:
- Extract (read): orange
- Transform: yellow
- Load: red
- brown
row filters
Analytical nodes: custom logic, rule engine node
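A rough sketch of what a row filter and a rule engine node do, assuming rows are dicts. The row data, the rule list, and the "tier" column are hypothetical and not taken from any specific tool.

```python
# Hypothetical rows; the field names are made up for illustration.
rows = [
    {"customer": "a", "amount": 120},
    {"customer": "b", "amount": 40},
    {"customer": "c", "amount": 900},
]

def row_filter(rows, predicate):
    """Keep only the rows for which the predicate is true."""
    return [row for row in rows if predicate(row)]

# Rules are checked top to bottom; the first matching rule sets the label.
RULES = [
    (lambda r: r["amount"] >= 500, "high"),
    (lambda r: r["amount"] >= 100, "medium"),
]

def rule_engine(rows, rules, default="low"):
    """Add a 'tier' column using the first rule that matches each row."""
    labelled = []
    for row in rows:
        label = next((out for cond, out in rules if cond(row)), default)
        labelled.append({**row, "tier": label})
    return labelled

filtered = row_filter(rows, lambda r: r["amount"] >= 100)
tiers = [r["tier"] for r in rule_engine(rows, RULES)]
```

Rule order matters here: putting the "medium" rule first would label every row above 100 as "medium".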
Workflow annotations: boxes and labels that group nodes for readability
Extensions?
automation?
An open source tool for data exploration.
Related:
Visual workflows are useful for agentic systems
Use workflows to ensure the language in text documents meets a given standard.
Types of workflows
Terminology updater
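One possible shape for a terminology updater step, sketched with a glossary of deprecated terms mapped to preferred ones. The glossary entries and the function name are assumptions for illustration; note that this simple version does not preserve the original capitalisation.

```python
import re

# Hypothetical glossary: deprecated term -> preferred term.
GLOSSARY = {
    "e-mail": "email",
    "log-in": "login",
    "web site": "website",
}

def update_terminology(text, glossary):
    """Replace whole-word, case-insensitive matches of deprecated terms."""
    # Longest terms first so overlapping entries do not shadow each other.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        flags=re.IGNORECASE,
    )
    # Look up the lowercased match, so the replacement is always lowercase.
    return pattern.sub(lambda m: glossary[m.group(0).lower()], text)

print(update_terminology("Send an E-mail from the web site after log-in.", GLOSSARY))
```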
Handling Imbalanced Data
E.g. identifying fraud: it happens rarely
Uneven mix of data
Class separation
High accuracy
What do the evaluation metrics say?
High accuracy in prediction for majority but poor for minority class.
Patterns appear most heavily in the majority class.
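A small worked example of why accuracy alone misleads on imbalanced data. The labels are synthetic (10 fraud cases in 1000) and the "model" is a hypothetical one that always predicts the majority class.

```python
# Synthetic labels: 1 = fraud (minority), 0 = legitimate (majority).
y_true = [1] * 10 + [0] * 990
# Hypothetical model that always predicts the majority class.
y_pred = [0] * 1000

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) with `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Undefined ratios are reported as 0.0 by convention here.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }

m = metrics(y_true, y_pred)
print(m)
```

Accuracy comes out at 0.99 while recall on the fraud class is 0.0, so per-class metrics are the ones to check.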
Class-Imbalance Problem:
Aim: ensure that the prediction algorithm pays equal attention to the patterns presented by all classes, majority and minority.
Possible solution: change the distribution in the training data.
Data Sampling Methods:
- Oversampling: increase the number of minority-class samples
- SMOTE
- Undersampling
- Bootstrapping
- Combinations
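The sampling methods listed above can be sketched in plain Python. This is a simplified illustration on made-up 2-D points, not a reference implementation: `smote_like` only captures the core idea of SMOTE (interpolating between a minority sample and one of its nearest minority neighbours), and all function names are hypothetical.

```python
import random

def random_oversample(minority, target_size, rng):
    """Bootstrap-style: draw minority samples with replacement."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra

def random_undersample(majority, target_size, rng):
    """Keep a random subset of the majority class, without replacement."""
    return rng.sample(majority, target_size)

def smote_like(minority, n_new, k, rng):
    """Create synthetic points between a sample and one of its k nearest
    minority neighbours (squared Euclidean distance)."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Made-up data: 3 minority points near the origin, 6 majority points near (5, 5).
minority = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)]
majority = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1), (4.8, 4.8), (5.0, 5.3)]

rng = random.Random(0)  # seeded so the run is repeatable
oversampled = random_oversample(minority, len(majority), rng)
undersampled = random_undersample(majority, len(minority), rng)
synthetic = smote_like(minority, n_new=3, k=2, rng=rng)
```

A combination would, for example, undersample the majority part of the way and fill the remaining gap with synthetic minority points. Resampling should be applied to the training split only.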
Cost-Sensitive Methods: focus on the cost of making mistakes, i.e. the cost of misclassifications.
- unequal costing, threshold-based
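A sketch of the threshold-based idea with unequal costs. If a false negative costs `cost_fn` and a false positive costs `cost_fp`, predicting the positive class is cheaper in expectation whenever p * cost_fn > (1 - p) * cost_fp, i.e. when p > cost_fp / (cost_fp + cost_fn). The costs, probabilities, and labels below are invented for illustration.

```python
def cost_threshold(cost_fp, cost_fn):
    """Probability above which predicting positive has lower expected cost."""
    return cost_fp / (cost_fp + cost_fn)

def classify(probs, threshold):
    return [1 if p >= threshold else 0 for p in probs]

def total_cost(y_true, y_pred, cost_fp, cost_fn):
    """Sum the cost of every false positive and false negative."""
    cost = 0
    for t, p in zip(y_true, y_pred):
        if t == 0 and p == 1:
            cost += cost_fp
        elif t == 1 and p == 0:
            cost += cost_fn
    return cost

# Hypothetical costs: missing a fraud case is 9x worse than a false alarm.
COST_FP, COST_FN = 1, 9
# Made-up predicted fraud probabilities and true labels.
probs = [0.05, 0.2, 0.4, 0.7, 0.02, 0.15]
y_true = [0, 1, 1, 1, 0, 0]

threshold = cost_threshold(COST_FP, COST_FN)
cost_default = total_cost(y_true, classify(probs, 0.5), COST_FP, COST_FN)
cost_tuned = total_cost(y_true, classify(probs, threshold), COST_FP, COST_FN)
print(threshold, cost_default, cost_tuned)
```

The lowered threshold accepts one extra false alarm in exchange for catching the two fraud cases the default 0.5 cut-off misses. The formula assumes the predicted probabilities are reasonably well calibrated.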
Algorithmic methods: SVM, decision trees, clustering, one-class based
Ensemble Methods: bagging, boosting, active learning