Data Archive
Folder: standardised
678 items under this folder.
17 Feb 2025
fact table
17 Feb 2025
facts
17 Feb 2025
filter methods
statistics
17 Feb 2025
What is Functional Programming?
software
17 Feb 2025
gaussian_mixture_model_implementation.py
17 Feb 2025
gitlab-ci.yml
17 Feb 2025
What is Granularity
database
data_modeling
17 Feb 2025
heterogeneous features
data_cleaning
17 Feb 2025
how do you do the data selection
17 Feb 2025
What is imperative?
data_orchestration
17 Feb 2025
What is an In-Memory Format?
data_storage
17 Feb 2025
incremental synchronization
17 Feb 2025
inference versus prediction
17 Feb 2025
inference
17 Feb 2025
information theory
math
17 Feb 2025
Interpretability
drafting
model_explainability
17 Feb 2025
interview notepad
career
17 Feb 2025
ipynb
software
17 Feb 2025
What is a Jinja Template?
software
17 Feb 2025
What is Kubernetes?
data_orchestration
software
17 Feb 2025
What is a Lambda Architecture?
data_modeling
data_orchestration
17 Feb 2025
learning rate
ml_optimisation
17 Feb 2025
lemmatization
NLP
17 Feb 2025
loss function
deep_learning
model_architecture
ml_optimisation
17 Feb 2025
What is MapReduce?
data_cleaning
17 Feb 2025
What is Master Data Management (MDM)?
data_storage
data_governance
data_management
17 Feb 2025
mean absolute error
17 Feb 2025
melt
data_transformation
17 Feb 2025
What is a Metric?
business
17 Feb 2025
nltk
17 Feb 2025
npy Files A NumPy Array storage
software
17 Feb 2025
oltp (online transactional processing)
17 Feb 2025
p values
statistics
17 Feb 2025
p-values in linear regression in sklearn
17 Feb 2025
parametric vs non-parametric models
17 Feb 2025
parametric vs non-parametric tests
17 Feb 2025
parsimonious
17 Feb 2025
pdoc
17 Feb 2025
programming languages
17 Feb 2025
Prompting
prompt
17 Feb 2025
What is a Push-Down?
database
17 Feb 2025
What is Reverse ETL?
data_transformation
17 Feb 2025
rollup
database
17 Feb 2025
What is Schema Evolution?
database
17 Feb 2025
semantic layer
database
data_storage
17 Feb 2025
What is semi-structured data?
data_modeling
data_storage
17 Feb 2025
shapefile
software
17 Feb 2025
sklearn datasets
17 Feb 2025
What is a Storage Layer / Object Store?
data_storage
17 Feb 2025
What is structured data?
data_modeling
data_storage
17 Feb 2025
syntactic relationships
17 Feb 2025
t-SNE
data_visualization
drafting
17 Feb 2025
tool.bandit
17 Feb 2025
tool.ruff
17 Feb 2025
tool.uv
software
17 Feb 2025
transfer_learning.py
17 Feb 2025
unittest
17 Feb 2025
unstructured data
data_modeling
data_storage
17 Feb 2025
vanishing and exploding gradients problem
drafting
17 Feb 2025
variance
17 Feb 2025
What is YAML?
software
17 Feb 2025
Tableau
data_visualization
17 Feb 2025
Technical Debt
software
17 Feb 2025
Technical Design Doc Template
17 Feb 2025
Telecommunications
17 Feb 2025
Tensorflow
deep_learning
software
17 Feb 2025
Terminal commands
software
17 Feb 2025
Test Loss When Evaluating Models
17 Feb 2025
Testing
17 Feb 2025
Testing_Pytest.py
17 Feb 2025
Testing_unittest.py
17 Feb 2025
Text2Cypher
17 Feb 2025
Thinking Systems
career
drafting
17 Feb 2025
Time Series Forecasting
17 Feb 2025
Time Series Identify Trends and Patterns
17 Feb 2025
Time Series
17 Feb 2025
Tokenisation
NLP
code_snippet
17 Feb 2025
Train-Dev-Test Sets
17 Feb 2025
Transaction
17 Feb 2025
Transfer Learning
model_algorithm
17 Feb 2025
Transformed Target Regressor
17 Feb 2025
Transformer
deep_learning
NLP
17 Feb 2025
Transformers vs RNNs
deep_learning
17 Feb 2025
TypeScript
software
17 Feb 2025
Types of Computational Bugs
17 Feb 2025
Types of Database Schema
17 Feb 2025
Types of Neural Networks
17 Feb 2025
Typical Output Formats in Neural Networks
17 Feb 2025
UML
data_modeling
17 Feb 2025
Ubuntu
17 Feb 2025
Unsupervised learning
clustering
field
17 Feb 2025
Untitled
17 Feb 2025
Use Cases for a Simple Neural Network Like
17 Feb 2025
Use of RNNs in energy sector
time_series
deep_learning
energy
anomaly_detection
17 Feb 2025
Using SQLite to Process and Split Combined Data from Excel
database
17 Feb 2025
Vector Database
17 Feb 2025
Vector Embedding
math
language_models
drafting
17 Feb 2025
Vector_Embedding.py
17 Feb 2025
Vectorisation
software
17 Feb 2025
Vectorized Engine
17 Feb 2025
View Use Case
17 Feb 2025
Views
database
17 Feb 2025
Violin plot
statistics
17 Feb 2025
Virtual environments
software
17 Feb 2025
WCSS and elbow method
clustering
17 Feb 2025
Weak Learners
17 Feb 2025
Web Feature Server (WFS)
17 Feb 2025
Web Map Tile Service (WMTS)
17 Feb 2025
What algorithms or models are used within the energy sector
question
energy
17 Feb 2025
What algorithms or models are used within the telecommunication sector
17 Feb 2025
What are the best practices for evaluating the effectiveness of different prompts
17 Feb 2025
What can ABM solve within the energy sector
question
17 Feb 2025
What is the difference between odds and probability
question
math
17 Feb 2025
What is the role of gradient-based optimization in training deep learning models.
question
17 Feb 2025
When and why not to use regularisation
17 Feb 2025
Why JSON is Better than Pickle for Untrusted Data
17 Feb 2025
Why Type 1 and Type 2 matter
17 Feb 2025
Why Use Views
17 Feb 2025
Why and when is feature scaling necessary
17 Feb 2025
Why does increasing the number of models in a ensemble not necessarily improve the accuracy
17 Feb 2025
Why does the Adam Optimizer converge
17 Feb 2025
Why is named entity recognition (NER) a challenging task
17 Feb 2025
Why is the Central Limit Theorem important when working with small sample sizes
17 Feb 2025
Wikipedia_API.py
17 Feb 2025
Windows Subsystem for Linux
17 Feb 2025
Word2Vec.py
17 Feb 2025
Word2vec
17 Feb 2025
Wrapper Methods
17 Feb 2025
XGBoost
ml_optimisation
17 Feb 2025
Z-Normalisation
17 Feb 2025
Z-NormalisationZ-Score
17 Feb 2025
What is the Big-O Notation?
math
17 Feb 2025
binary classification
17 Feb 2025
What is Business Intelligence
business
17 Feb 2025
cleaning terminal path
software
17 Feb 2025
What is Dagster?
data_orchestration
17 Feb 2025
data asset
17 Feb 2025
What is Data Governance?
business
data_governance
17 Feb 2025
The Data Hierarchy of Needs
data_management
17 Feb 2025
What is Data Integration?
data_storage
data_orchestration
17 Feb 2025
What is Data Lineage?
data_management
17 Feb 2025
What is Data Literacy?
business
17 Feb 2025
What is a Data Product?
data_management
business
17 Feb 2025
What is Data Quality?
data_quality
17 Feb 2025
data virtualization
17 Feb 2025
dbt
software
17 Feb 2025
What is declarative?
data_orchestration
field
17 Feb 2025
dimensions
data_modeling
17 Feb 2025
duckdb
17 Feb 2025
emergent behavior
17 Feb 2025
ETL vs. ELT
data_transformation
17 Feb 2025
etlt
data_transformation
17 Feb 2025
Pytest
17 Feb 2025
Python
software
17 Feb 2025
Pytorch vs Tensorflow
17 Feb 2025
Q-Learning
regressor
ml_process
17 Feb 2025
QUERY GSheets
17 Feb 2025
Quartz
software
17 Feb 2025
Query Optimisation
database
17 Feb 2025
Querying
database
17 Feb 2025
R squared
statistics
17 Feb 2025
R-squared metric not always a good indicator of model performance in regression
17 Feb 2025
RAG
17 Feb 2025
REST API
17 Feb 2025
ROC (Receiver Operating Characteristic)
evaluation
17 Feb 2025
ROC_Curve.py
17 Feb 2025
Random Forest Regression
17 Feb 2025
Random Forests
classifier
drafting
17 Feb 2025
React
17 Feb 2025
Reasoning tokens
17 Feb 2025
Recall
evaluation
17 Feb 2025
Recommender systems
evaluation
model_algorithm
17 Feb 2025
Recurrent Neural Networks
deep_learning
time_series
17 Feb 2025
Regression metrics
code_snippet
evaluation
17 Feb 2025
Regression Analysis and its Applications
statistics
regressor
17 Feb 2025
Regularisation of Tree based models
ml_process
ml_optimisation
evaluation
model_explainability
17 Feb 2025
Regularization in Machine Learning
deleted
ml_process
data_visualization
statistics
ml_optimisation
model_explainability
17 Feb 2025
Regularisation.py
17 Feb 2025
Reinforcement learning
field
reinforcement_learning
17 Feb 2025
Relating Tables Together
database
17 Feb 2025
Relationships in memory
memory_management
language_models
17 Feb 2025
Reward Function
17 Feb 2025
Ridge
drafting
17 Feb 2025
Row-based Storage
17 Feb 2025
SHapley Additive exPlanations
17 Feb 2025
SMOTE (Synthetic Minority Over-sampling Technique)
17 Feb 2025
SMSS
17 Feb 2025
SQL vs NoSQL
question
software
17 Feb 2025
What is SQL?
software
database
17 Feb 2025
SQLite
17 Feb 2025
SVM_Example.py
17 Feb 2025
Sarsa
17 Feb 2025
Scala
software
17 Feb 2025
Scalability
17 Feb 2025
Scaling Agentic Systems
17 Feb 2025
Scaling Server
17 Feb 2025
Scaling databases
database
17 Feb 2025
Scientific Method
field
drafting
17 Feb 2025
Search
17 Feb 2025
Security
17 Feb 2025
Semantic Relationships
17 Feb 2025
Sentence Similarity
17 Feb 2025
Sharepoint
software
17 Feb 2025
Silhouette Analysis
17 Feb 2025
Single source of truth
17 Feb 2025
Sklearn Pipeline
code_snippet
data_transformation
17 Feb 2025
Sklearn
data_cleaning
17 Feb 2025
What is Slowly Changing Dimension?
database
17 Feb 2025
Small Language Models
NLP
language_models
17 Feb 2025
Smart Grids
energy
17 Feb 2025
Snowflake Schema
17 Feb 2025
Snowflake
17 Feb 2025
What is a Soft Delete?
database
17 Feb 2025
Software Development Life Cycle
data_orchestration
17 Feb 2025
Software Development
portal
17 Feb 2025
SparseCategoricalCrossentropy or CategoricalCrossentropy
17 Feb 2025
Specificity
evaluation
17 Feb 2025
Spreadsheets to Databases
17 Feb 2025
Stack
data_transformation
17 Feb 2025
Stacking
17 Feb 2025
Standard deviation
17 Feb 2025
Standardisation
17 Feb 2025
Star Schema
17 Feb 2025
Statistical Assumptions
17 Feb 2025
Statistical Tests
17 Feb 2025
Statistics
statistics
drafting
17 Feb 2025
Stemming
17 Feb 2025
Stochastic Gradient Descent
17 Feb 2025
Strongly vs Weakly typed language
17 Feb 2025
Summarisation
NLP
17 Feb 2025
Supervised Learning
field
17 Feb 2025
Support Vector Classifier (SVC)
17 Feb 2025
Support Vector Machines
classifier
clustering
17 Feb 2025
Support Vector Regression
17 Feb 2025
Symbolic computation
17 Feb 2025
Sympy
17 Feb 2025
TF-IDF
NLP
17 Feb 2025
TOML
17 Feb 2025
Model Building
ml_optimisation-evaluation
17 Feb 2025
Model Cascading
17 Feb 2025
Model Deployment
deleted
model_architecture
17 Feb 2025
Model Ensemble
deleted
model_architecture
17 Feb 2025
Model Evaluation vs Model Optimisation
17 Feb 2025
Model Evaluation
evaluation
deleted
17 Feb 2025
Model Interpretability
17 Feb 2025
Model Observability
deleted
model_explainability
17 Feb 2025
Model Optimisation
drafting
17 Feb 2025
Model Parameters Tuning
ml_optimisation
model_selection
17 Feb 2025
Model Parameters
17 Feb 2025
Model Selection
ml_process
deleted
evaluation
17 Feb 2025
Model Validation
17 Feb 2025
Model parameters vs hyperparameters
17 Feb 2025
Model preparation
17 Feb 2025
Momentum
17 Feb 2025
Momentum.py
17 Feb 2025
MongoDB
17 Feb 2025
Monolith Architecture
software_architecture
17 Feb 2025
Multi-Agent Reinforcement Learning
question
17 Feb 2025
Multi-head attention
deleted
deep_learning
17 Feb 2025
Multicollinearity
code_snippet
statistics
17 Feb 2025
Multinomial Naive bayes
17 Feb 2025
MySql
17 Feb 2025
Natural Language Processing
NLP
17 Feb 2025
Naive Bayes
classifier
17 Feb 2025
Technical Analysis of Named Entity Recognition
NLP
model_algorithm
17 Feb 2025
Network Design
energy
17 Feb 2025
Neural Network Classification
17 Feb 2025
Neural Scaling Laws
drafting
17 Feb 2025
Neural network in Practice
17 Feb 2025
Neural network
deep_learning
drafting
17 Feb 2025
Ngrams
17 Feb 2025
NoSQL
17 Feb 2025
Non-parametric tests
17 Feb 2025
Normalisation of Text
NLP
code_snippet
17 Feb 2025
Normalisation of data
17 Feb 2025
Normalisation vs Standardisation
17 Feb 2025
Normalisation
portal
17 Feb 2025
What is Normalization?
database
17 Feb 2025
NotebookLM
portal
17 Feb 2025
What is OLAP (Online Analytical Processing)?
database
data_cleaning
17 Feb 2025
OLTP
17 Feb 2025
One Pager Template
17 Feb 2025
One-hot encoding
17 Feb 2025
Optimisation function
ml_optimisation
model_selection
17 Feb 2025
Optimisation techniques
17 Feb 2025
Optimising Neural Networks
17 Feb 2025
Optimising a Logistic Regression Model
17 Feb 2025
Optuna
17 Feb 2025
Ordinary Least Squares
17 Feb 2025
Orthogonalization
17 Feb 2025
Outliers
statistics
anomaly_detection
data_cleaning
17 Feb 2025
Overfitting in Machine Learning
model_architecture
17 Feb 2025
PCA Explained Variance Ratio
17 Feb 2025
PDP and ICE
17 Feb 2025
Pandas
data_transformation
17 Feb 2025
Parametric tests
17 Feb 2025
Parquet
data_storage
17 Feb 2025
Part of speech tagging
17 Feb 2025
Performance Drift in Machine Learning
deleted
data_quality
model_explainability
17 Feb 2025
Physical Model
17 Feb 2025
What is a policy in RL
question
17 Feb 2025
PostgreSQL
17 Feb 2025
PowerBI
software
data_visualization
17 Feb 2025
PowerShell
software
17 Feb 2025
Powerquery
software
17 Feb 2025
Powershell versus cmd
software
17 Feb 2025
Powershell vs Bash
17 Feb 2025
Precision or Recall
evaluation
17 Feb 2025
Precision-Recall Curve
17 Feb 2025
Precision
evaluation
17 Feb 2025
Preprocessing
ml_optimisation
data_transformation
data_cleaning
data_collection
portal
17 Feb 2025
Primary Key
17 Feb 2025
Principal Component Analysis
data_cleaning
data_visualization
17 Feb 2025
Probability in other fields
17 Feb 2025
Problem Definition
17 Feb 2025
Prompt Extracting information from blog posts
prompt
17 Feb 2025
Prompt engineering
language_models
NLP
17 Feb 2025
Publish and Subscribe
17 Feb 2025
Pull Request Template
17 Feb 2025
PyCaret
17 Feb 2025
PySpark
data_orchestration
software
17 Feb 2025
PyTorch
software
17 Feb 2025
Pycaret_Example.py
17 Feb 2025
Pydantic
17 Feb 2025
Pydantic.py
17 Feb 2025
Pydantic_More.py
17 Feb 2025
Pyright vs Pydantic
17 Feb 2025
Pyright
prompt
17 Feb 2025
Handling Different Distributions
17 Feb 2025
Handling Missing Data
17 Feb 2025
Handwritten Digit Classification
17 Feb 2025
Hashes
17 Feb 2025
Heatmap
code_snippet
data_visualization
17 Feb 2025
Hierarchical Clustering
17 Feb 2025
High cross validation accuracy is not directly proportional to performance on unseen test data
17 Feb 2025
How businesses use Gen AI
business
GenAI
deleted
17 Feb 2025
How do we evaluate LLM Outputs
17 Feb 2025
How is reinforcement learning being combined with deep learning
question
17 Feb 2025
How is schema evolution done in practice with SQL
question
17 Feb 2025
How to do git commit messages properly
17 Feb 2025
How to model to improve demand forecasting
question
17 Feb 2025
How to normalise a merged table
17 Feb 2025
How to reduce the need for Gen AI responses
GenAI
business
17 Feb 2025
How to search within a graph
17 Feb 2025
How to use Sklearn Pipeline
question
17 Feb 2025
Hugging Face
software
17 Feb 2025
Hyperparameter Tuning
17 Feb 2025
Hyperparameter
drafting
17 Feb 2025
Hypothesis testing
statistics
17 Feb 2025
Imbalanced Datasets
data_quality
data_cleaning
data_exploration
17 Feb 2025
Imbalanced_Datasets_SMOTE.py
17 Feb 2025
Implementing Database Schema
17 Feb 2025
In NER how would you handle ambiguous entities
17 Feb 2025
Industries of interest
career
17 Feb 2025
Input is Not Properly Sanitized
17 Feb 2025
Isolation Forest and Its Use in Anomaly Detection
anomaly_detection
data_quality
17 Feb 2025
Java vs JavaScript
software
17 Feb 2025
Json to Yaml
17 Feb 2025
Json
17 Feb 2025
Junction Tables
17 Feb 2025
Justfile
17 Feb 2025
K-means
clustering
17 Feb 2025
K-nearest neighbours
classifier
17 Feb 2025
K_Means.py
17 Feb 2025
Kaggle Abalone regression example
17 Feb 2025
Kernelling
17 Feb 2025
Key Differences of Web Feature Server (WFS) and Web Feature Server (WFS)
17 Feb 2025
Kmeans vs GMM
17 Feb 2025
Knowledge Graph
17 Feb 2025
Knowledge Graphs with Obsidian
17 Feb 2025
Knowledge Work
career
17 Feb 2025
Knowledge graph vs RAG setup
17 Feb 2025
LBFGS
17 Feb 2025
LLM Evaluation Metrics
17 Feb 2025
LLM
language_models
17 Feb 2025
LSTM
deep_learning
time_series
code_snippet
drafting
17 Feb 2025
Labelling data
17 Feb 2025
Language Model Output Optimisation
17 Feb 2025
Language Models Large (LLMs) vs Small (SLMs)
17 Feb 2025
Language Models
17 Feb 2025
Lasso
drafting
17 Feb 2025
Latency
17 Feb 2025
Learning Styles
model_architecture
17 Feb 2025
LightGBM vs XGBoost vs CatBoost
17 Feb 2025
LightGBM
ml_optimisation
17 Feb 2025
Linear Discriminant Analysis
17 Feb 2025
Linear Regression
regressor
17 Feb 2025
Load Balancing
17 Feb 2025
Local Interpretable Model-agnostic Explanations
17 Feb 2025
Logical Model
17 Feb 2025
Logistic Regression
classifier
regressor
17 Feb 2025
Logistic regression in sklearn & Gradient Descent
17 Feb 2025
Loss versus Cost function
17 Feb 2025
ML Engineer
17 Feb 2025
MNIST
17 Feb 2025
Machine Learning Algorithms
ml_process
model_algorithm
17 Feb 2025
Machine Learning Operations
drafting
17 Feb 2025
Machine Learning Workflow
portal
17 Feb 2025
What is Machine Learning?
field
17 Feb 2025
Maintainable Code
17 Feb 2025
Makefile
17 Feb 2025
Manifold learning
deleted
data_exploration
17 Feb 2025
Many-to-Many Relationships
17 Feb 2025
Markov Decision Processes
model_algorithm
17 Feb 2025
Markov chain
17 Feb 2025
Mathematical Reasoning in Transformers
question
17 Feb 2025
Mean Squared Error
17 Feb 2025
Memory Caching
17 Feb 2025
Memory
17 Feb 2025
Mermaid
data_modeling
17 Feb 2025
Metadata Handling
17 Feb 2025
Methods for Handling Outliers
17 Feb 2025
Microsoft Access
software
database
17 Feb 2025
Mini-batch gradient descent
17 Feb 2025
Mixture of Experts
17 Feb 2025
Docker
17 Feb 2025
Documentation
17 Feb 2025
Dropout
deep_learning
ml_optimisation
17 Feb 2025
EDA
data_exploration
data_transformation
17 Feb 2025
ELT
data_transformation
17 Feb 2025
ER Diagrams
data_quality
database
17 Feb 2025
ETL Pipeline example
data_transformation
17 Feb 2025
What is ETL?
data_transformation
17 Feb 2025
Edge Machine Learning Models
17 Feb 2025
Education and Training
17 Feb 2025
Elastic Net
code_snippet
17 Feb 2025
Embedded Methods
17 Feb 2025
Encoding Categorical Variables
code_snippet
regressor
data_cleaning
17 Feb 2025
Energy ABM
17 Feb 2025
Energy Storage
energy
17 Feb 2025
Energy
energy
17 Feb 2025
Epoch
17 Feb 2025
Estimator
17 Feb 2025
Evaluating Language Models
evaluation
language_models
17 Feb 2025
Evaluation Metrics
code_snippet
evaluation
17 Feb 2025
Event Driven Events
17 Feb 2025
Event Driven Microservices
17 Feb 2025
Event Driven
17 Feb 2025
Event-Driven Architecture
17 Feb 2025
Everything
software
17 Feb 2025
Excel & Sheets
software
business
17 Feb 2025
Explain different gradient descent algorithms, their advantages, and limitations.
question
17 Feb 2025
Explain the curse of dimensionality
data_cleaning
17 Feb 2025
Exploration vs. Exploitation
17 Feb 2025
Exploration
drafting
17 Feb 2025
Exponential Smoothing
17 Feb 2025
F1 Score
17 Feb 2025
Fabric
software
17 Feb 2025
Factor Analysis
17 Feb 2025
Factor_Analysis.py
17 Feb 2025
FastAPI
17 Feb 2025
FastAPI_Example.py
17 Feb 2025
Feature Engineering
ml_process
ml_optimisation
17 Feb 2025
Feature Evaluation
17 Feb 2025
Feature Extraction
data_transformation
17 Feb 2025
Feature Importance
ml_process
evaluation
model_explainability
17 Feb 2025
Feature Scaling
data_cleaning
data_processing
17 Feb 2025
Feature Selection vs Feature Importance
17 Feb 2025
Feature Selection
ml_process
drafting
17 Feb 2025
Feature selection and creation
17 Feb 2025
Feed Forward Neural Network
deep_learning
classifier
17 Feb 2025
Feedback Template
17 Feb 2025
Filter method
17 Feb 2025
Firebase
17 Feb 2025
Fitting weights and biases of a neural network
17 Feb 2025
Flask
software
17 Feb 2025
Folder Tree Diagram
software
17 Feb 2025
Foreign Key
17 Feb 2025
Forward Propagation in Neural Networks
deep_learning
statistics
17 Feb 2025
Full Lifecycle Management
data_management
17 Feb 2025
GIS
17 Feb 2025
GRU
17 Feb 2025
GSheets
17 Feb 2025
Gaussian Distribution
17 Feb 2025
Gaussian Mixture Models
clustering
17 Feb 2025
Gaussian Model
17 Feb 2025
Generative AI From Theory to Practice
17 Feb 2025
Generative AI
17 Feb 2025
Generative Adversarial Networks
17 Feb 2025
Get data
data_collection
17 Feb 2025
Gini Impurity vs Cross Entropy
17 Feb 2025
Gini Impurity
17 Feb 2025
Git
17 Feb 2025
Gitlab
17 Feb 2025
Google Cloud Platform
17 Feb 2025
Google My Maps Data Extraction
17 Feb 2025
Gradient Boosting Regressor
regressor
17 Feb 2025
Gradient Boosting
ml_optimisation
17 Feb 2025
Gradient Descent
ml_optimisation
17 Feb 2025
Gradio
17 Feb 2025
Grain
17 Feb 2025
Grammar method
17 Feb 2025
Graph Analysis Plugin
17 Feb 2025
GraphRAG
drafting
17 Feb 2025
Grep
17 Feb 2025
GridSearchCV
17 Feb 2025
Groupby vs Crosstab
17 Feb 2025
Groupby
data_transformation
17 Feb 2025
Guardrails
GenAI
business
17 Feb 2025
Hadoop
software
17 Feb 2025
Continuous Integration
17 Feb 2025
Converting categorical variables to a dummy indicators
17 Feb 2025
Convolutional Neural Networks
17 Feb 2025
Correlation vs Causation
17 Feb 2025
Correlation
statistics
17 Feb 2025
Cosine Similarity
17 Feb 2025
Cost Function
17 Feb 2025
Cost-Sensitive Analysis
evaluation
17 Feb 2025
Covariance Structures
17 Feb 2025
Covariance vs Correlation
17 Feb 2025
Covariance
statistics
data_analysis
17 Feb 2025
Cross Entropy
model_architecture
ml_optimisation
17 Feb 2025
Cross validation
evaluation
17 Feb 2025
Cross_Entropy.py
17 Feb 2025
Cross_Entropy_Single.py
17 Feb 2025
Current challenges within the energy sector
question
17 Feb 2025
DBScan
clustering
17 Feb 2025
Dash
17 Feb 2025
Dashboarding
17 Feb 2025
Data AI Education at Work
business
17 Feb 2025
Data Analysis
17 Feb 2025
Data Analyst
17 Feb 2025
Data Architect
17 Feb 2025
Data Archive Graph Analysis
17 Feb 2025
Data Cleansing
data_transformation
data_cleaning
portal
17 Feb 2025
Data Collection
17 Feb 2025
Data Drift
17 Feb 2025
Data Engineer
career
17 Feb 2025
Data Engineering Portal
portal
17 Feb 2025
Data Engineering
field
17 Feb 2025
Data Ingestion
17 Feb 2025
Data Integrity
17 Feb 2025
What is a Data Lake?
data_storage
17 Feb 2025
What is a Data Lakehouse?
data_storage
17 Feb 2025
Data Leakage
17 Feb 2025
Data Management
data_management
17 Feb 2025
Data Mining - CRISP
business
17 Feb 2025
Data Modelling
data_modeling
17 Feb 2025
What is Data Observability?
data_orchestration
data_management
17 Feb 2025
Data Orchestration
data_orchestration
17 Feb 2025
Data Pipeline to Data Products
question
data_orchestration
anomaly_detection
data_pipeline
data_products
17 Feb 2025
Data Pipeline
data_pipeline
17 Feb 2025
Data Principles
data_quality
data_governance
17 Feb 2025
Data Reduction
17 Feb 2025
Data Roles
17 Feb 2025
Data Science
field
portal
17 Feb 2025
Data Scientist
17 Feb 2025
Data Selection
data_transformation
17 Feb 2025
Data Steward
17 Feb 2025
Data Streaming
data_orchestration
17 Feb 2025
What is data transformation?
data_cleaning
data_transformation
17 Feb 2025
Data Validation
17 Feb 2025
Data Visualisation
17 Feb 2025
What is a Data Warehouse?
database
data_storage
17 Feb 2025
Data storage
database
data_storage
17 Feb 2025
Database Index
database_optimisation
17 Feb 2025
Database Management System (DBMS)
17 Feb 2025
What are Data Processing Techniques (row-based, columnar, vectorized)?
database
data_cleaning
17 Feb 2025
Database schema
drafting
database
17 Feb 2025
Database
database
17 Feb 2025
Databricks vs Snowflake
software
data_storage
17 Feb 2025
Databricks
software
17 Feb 2025
Datasets
17 Feb 2025
Debugging ipynb
17 Feb 2025
Debugging
data_exploration
17 Feb 2025
Debugging.py
17 Feb 2025
Decision Tree
classifier
regressor
17 Feb 2025
Deep Learning Frameworks
deep_learning
drafting
17 Feb 2025
Deep Learning Overview
deep_learning
17 Feb 2025
Deep Q-Learning
17 Feb 2025
DeepSeek
drafting
17 Feb 2025
Demand forecasting
question
energy
17 Feb 2025
Dendrograms
17 Feb 2025
Determining Threshold Values
17 Feb 2025
What is DevOps?
data_orchestration
17 Feb 2025
Difference between Databricks vs. Snowflake
17 Feb 2025
Difference between snowflake to hadoop
software_architecture
data_storage
17 Feb 2025
Digital Transformation
business
17 Feb 2025
Digital twin
data_modeling
17 Feb 2025
Dimension Table
17 Feb 2025
Dimensional Modelling
data_modeling
17 Feb 2025
Dimensionality Reduction
ml_process
data_visualization
17 Feb 2025
Directed Acyclic Graph (DAG)
math
data_orchestration
17 Feb 2025
Directory Structure
software
17 Feb 2025
Distillation
17 Feb 2025
Distributed Computing
data_management
data_processing
17 Feb 2025
Distributions
statistics
drafting
17 Feb 2025
Docker Image
17 Feb 2025
1-on-1 Template
17 Feb 2025
AB testing
17 Feb 2025
ACID Transaction
database
data_storage
17 Feb 2025
AI Engineer
17 Feb 2025
AI governance
17 Feb 2025
API Driven Microservices
software
business
17 Feb 2025
API
software
17 Feb 2025
ARIMA
17 Feb 2025
AUC
evaluation
17 Feb 2025
AWS Lambda
17 Feb 2025
Accessing Gen AI generated content
GenAI
evaluation
17 Feb 2025
Accuracy
evaluation
17 Feb 2025
Activation Function
deep_learning
17 Feb 2025
Activation atlases
17 Feb 2025
Active Learning
classifier
17 Feb 2025
Ada boosting
model_architecture
17 Feb 2025
Adam Optimizer
17 Feb 2025
Adaptive Learning Rates
17 Feb 2025
Addressing Multicollinearity
17 Feb 2025
Addressing_Multicollinearity.py
17 Feb 2025
Adjusted R squared
statistics
evaluation
17 Feb 2025
Agent-based modelling
17 Feb 2025
Agentic Solutions
drafting
17 Feb 2025
Algorithms
17 Feb 2025
Amazon S3
17 Feb 2025
Anomaly Detection in Time Series
17 Feb 2025
Anomaly Detection with Clustering
17 Feb 2025
Anomaly Detection with Statistical Methods
anomaly_detection
statistics
ml
17 Feb 2025
Anomaly Detection
17 Feb 2025
What is Apache Airflow?
data_orchestration
software
17 Feb 2025
Apache Kafka
software
data_orchestration
17 Feb 2025
What is Apache Spark?
software
17 Feb 2025
Attention Is All You Need
17 Feb 2025
Attention mechanism
language_models
17 Feb 2025
Automated Feature Creation
17 Feb 2025
Azure
software
data_storage
17 Feb 2025
BERT Pretraining of Deep Bidirectional Transformers for Language Understanding
17 Feb 2025
BERT
NLP
language_models
17 Feb 2025
BERTScore
17 Feb 2025
Backpropagation in Neural Networks
deep_learning
ml_optimisation
statistics
17 Feb 2025
Bag of words
NLP
17 Feb 2025
Bagging
model_architecture
17 Feb 2025
Bandit example output
17 Feb 2025
Bandit_Example_Fixed.py
17 Feb 2025
Bandit_Example_Nonfixed.py
17 Feb 2025
Baseline Forecasting
17 Feb 2025
Bash
17 Feb 2025
Batch Normalisation
17 Feb 2025
Batch Processing
data_orchestration
17 Feb 2025
Bellman Equations
question
17 Feb 2025
Bias and variance
model_architecture
model_explainability
17 Feb 2025
Big Data
data_storage
17 Feb 2025
BigQuery
17 Feb 2025
Boosting
model_architecture
model_explainability
17 Feb 2025
Boxplot
statistics
data_cleaning
data_visualization
17 Feb 2025
Business observability
business
17 Feb 2025
CI-CD
17 Feb 2025
CRUD
17 Feb 2025
Career Interest
portal
17 Feb 2025
CatBoost
17 Feb 2025
Central Limit Theorem
statistics
17 Feb 2025
Chain of thought
17 Feb 2025
Change Management
business
17 Feb 2025
Checksum
17 Feb 2025
Choosing a Threshold
17 Feb 2025
Choosing the Number of Clusters
17 Feb 2025
Class Separability
17 Feb 2025
Classification Report
17 Feb 2025
Classification
classifier
17 Feb 2025
Claude
17 Feb 2025
What are the top Cloud Providers?
data_storage
17 Feb 2025
Clustering
clustering
17 Feb 2025
Clustering_Dashboard.py
code_snippet
17 Feb 2025
Clustermap
17 Feb 2025
Code Diagrams
17 Feb 2025
Columnar Storage
17 Feb 2025
Command Prompt
software
17 Feb 2025
Command line
17 Feb 2025
Common Security Vulnerabilities in Software Development
software
17 Feb 2025
Common Table Expression
database
17 Feb 2025
Communication Techniques
communication
17 Feb 2025
Communication principles
communication
17 Feb 2025
Comparing LLM
17 Feb 2025
Comparing_Ensembles.py
17 Feb 2025
Components of the database
17 Feb 2025
Computer Science
17 Feb 2025
Conceptual Model
17 Feb 2025
Confidence Interval
statistics
17 Feb 2025
Confusion Matrix
evaluation
17 Feb 2025
Continuous Delivery - Deployment