Data Archive
Folder: standardised
894 items under this folder.
05 Aug 2025
vanishing and exploding gradients problem
deep_learning
ml
optimisation
05 Aug 2025
variance
statistics
05 Aug 2025
yaml
software
05 Aug 2025
unstructured data
modeling
storage
05 Aug 2025
transfer_learning.py
blank
05 Aug 2025
unittest
blank
05 Aug 2025
univariate vs multivariate
statistics
05 Aug 2025
structured data
modeling
storage
05 Aug 2025
syntactic relationships
language_models
05 Aug 2025
t-SNE
visualization
05 Aug 2025
tool.bandit
security
python
documentation
05 Aug 2025
tool.ruff
security
05 Aug 2025
tool.uv
software
05 Aug 2025
topic modeling
language_models
NLP
05 Aug 2025
spaCy
NLP
python
05 Aug 2025
stack memory
memory_management
05 Aug 2025
storage layer object store
storage
05 Aug 2025
semantic layer
database
storage
05 Aug 2025
semi-structured data
modeling
storage
05 Aug 2025
shapefile
file_type
05 Aug 2025
sklearn datasets
exploration
05 Aug 2025
pd.Grouper
transformation
05 Aug 2025
pdoc
documentation
python
05 Aug 2025
pmdarima
time_series
05 Aug 2025
programming languages
programming
05 Aug 2025
prompt retrievers
language_models
05 Aug 2025
push-down
database
05 Aug 2025
reverse etl
transformation
05 Aug 2025
rollup
database
05 Aug 2025
schema evolution
database
05 Aug 2025
search
exploration
05 Aug 2025
parametric vs non-parametric tests
test
statistics
05 Aug 2025
parsimonious
selection
statistics
05 Aug 2025
model-agnostic feature importance
modeling
explainability
05 Aug 2025
nbconvert slideshows
documentation
communication
05 Aug 2025
nbconvert
documentation
software
tool
05 Aug 2025
neo4j
database
05 Aug 2025
neomodel
python
graph
05 Aug 2025
nltk
NLP
05 Aug 2025
non-parametric
statistics
05 Aug 2025
npy Files A NumPy Array storage
file_type
05 Aug 2025
p values
statistics
05 Aug 2025
parametric vs non-parametric models
statistics
05 Aug 2025
maintainability
management
05 Aug 2025
map reduce
cleaning
05 Aug 2025
master data management
governance
management
storage
05 Aug 2025
mean absolute error
math
05 Aug 2025
mean vs median
statistics
visualization
05 Aug 2025
melt
transformation
05 Aug 2025
metric
business
05 Aug 2025
learning rate
optimisation
05 Aug 2025
lemmatization
NLP
05 Aug 2025
loss function
architecture
deep_learning
optimisation
05 Aug 2025
kubernetes
orchestration
software
05 Aug 2025
lambda architecture
modeling
orchestration
05 Aug 2025
information theory
math
05 Aug 2025
initialization methods
deep_learning
optimisation
05 Aug 2025
interoperable
explainability
05 Aug 2025
interpretability
drafting
explainability
05 Aug 2025
ipynb
software
05 Aug 2025
jinja template
software
05 Aug 2025
jupytext
communication
software
05 Aug 2025
imperative
orchestration
05 Aug 2025
in-memory format
storage
system
05 Aug 2025
incremental synchronization
management
05 Aug 2025
inference versus prediction
GenAI
ml
05 Aug 2025
inference
ml
05 Aug 2025
gitlab-ci.yml
file_type
05 Aug 2025
granularity
database
modeling
05 Aug 2025
heterogeneous features
cleaning
05 Aug 2025
how do you do the data selection
selection
05 Aug 2025
html
documentation
05 Aug 2025
filter methods
statistics
05 Aug 2025
frontend
frontend
05 Aug 2025
functional programming
software
05 Aug 2025
garbage collector
system
05 Aug 2025
gaussian_mixture_model_implementation.py
blank
05 Aug 2025
emergent behavior
ml
05 Aug 2025
etl vs elt
transformation
05 Aug 2025
etlt
transformation
05 Aug 2025
f-regression
explainability
statistics
05 Aug 2025
fact table
database
modeling
05 Aug 2025
facts
business
05 Aug 2025
data product
analysis
business
management
05 Aug 2025
data quality
data_quality
05 Aug 2025
data virtualization
business
orchestration
05 Aug 2025
dbt
tool
transformation
05 Aug 2025
dependency manager
devops
system
05 Aug 2025
dimensions
modeling
05 Aug 2025
documentation
documentation
communication
05 Aug 2025
duckdb
database
analysis
05 Aug 2025
embeddings for OOV words
NLP
optimisation
05 Aug 2025
data lineage
management
05 Aug 2025
data literacy
business
05 Aug 2025
Why Type 1 and Type 2 matter
classifier
evaluation
05 Aug 2025
Why does increasing the number of models in an ensemble not necessarily improve the accuracy
modeling
05 Aug 2025
Why does the Adam Optimizer converge
optimisation
05 Aug 2025
Why is named entity recognition (NER) a challenging task
language_models
05 Aug 2025
Why standardise features
ml
preprocessing
05 Aug 2025
Why use ER diagrams
documentation
05 Aug 2025
Wikipedia_API.py
blank
05 Aug 2025
Windows Scheduled Tasks
software
05 Aug 2025
Windows Subsystem for Linux
system
05 Aug 2025
Word2Vec.py
blank
05 Aug 2025
Word2vec
NLP
python
05 Aug 2025
WordNet
NLP
05 Aug 2025
Wrapper Methods
optimisation
05 Aug 2025
XGBoost
optimisation
05 Aug 2025
Xavier
deep_learning
optimisation
05 Aug 2025
Z-Normalisation
preprocessing
transformation
05 Aug 2025
Z-Score
statistics
05 Aug 2025
Z-Scores vs Prediction Intervals
statistics
05 Aug 2025
Z-Test
statistics
05 Aug 2025
altair versus seaborn
visualization
05 Aug 2025
bat
file_type
05 Aug 2025
big o notation
math
05 Aug 2025
binary classification
classifier
ml
05 Aug 2025
business intelligence
business
05 Aug 2025
conceptual data model
modeling
05 Aug 2025
csv module
python
05 Aug 2025
dagster
orchestration
05 Aug 2025
data governance
governance
business
05 Aug 2025
data hierarchy of needs
management
05 Aug 2025
data integration
orchestration
storage
05 Aug 2025
Vectorized Engine
querying
05 Aug 2025
Vercel
devops
05 Aug 2025
View Use Case
explainability
cleaning
transformation
05 Aug 2025
Views
database
05 Aug 2025
Violin plot
statistics
05 Aug 2025
Virtual environments
software
05 Aug 2025
WCSS and elbow method
clustering
05 Aug 2025
Weak Learners
modeling
05 Aug 2025
Web Feature Server (WFS)
file_type
05 Aug 2025
Web Map Tile Service (WMTS)
file_type
05 Aug 2025
When and why not to use regularisation
optimisation
exploration
data_quality
05 Aug 2025
Why JSON is Better than Pickle for Untrusted Data
file_type
05 Aug 2025
Why Removing Outliers May Improve Regression but Harm Classification
anomaly_detection
05 Aug 2025
Turning a flat file into a database
database
05 Aug 2025
Type I Error (False Positive)
evaluation
05 Aug 2025
Type II Error (False Negative)
evaluation
05 Aug 2025
TypeScript
software
05 Aug 2025
Types of Computational Bugs
devops
software
05 Aug 2025
Types of Database Schema
database
management
05 Aug 2025
Types of Neural Networks
deep_learning
ml
05 Aug 2025
Typical Output Formats in Neural Networks
deep_learning
algorithm
exploration
05 Aug 2025
UMAP
visualization
explainability
05 Aug 2025
UML
documentation
modeling
05 Aug 2025
Ubuntu
system
05 Aug 2025
Unix
system
05 Aug 2025
Unsupervised learning
clustering
field
05 Aug 2025
Use Cases for a Simple Neural Network Like
deep_learning
05 Aug 2025
Use of RNNs in energy sector
anomaly_detection
deep_learning
energy
time_series
05 Aug 2025
Vacuum
database
memory_management
05 Aug 2025
Variability in linear models
math
05 Aug 2025
Vector Database
database
05 Aug 2025
Vector Embedding
language_models
math
05 Aug 2025
Vector_Embedding.py
blank
05 Aug 2025
Vectorisation
code_snippet
software
05 Aug 2025
Testing_Pytest.py
blank
05 Aug 2025
Testing_unittest.py
blank
05 Aug 2025
Text2Cypher
querying
NLP
05 Aug 2025
Thinking Systems
business
career
05 Aug 2025
Time Series Forecasting
time_series
05 Aug 2025
Time Series Identify Trends and Patterns
time_series
05 Aug 2025
Time Series
modeling
05 Aug 2025
Tokenisation
code_snippet
NLP
preprocessing
05 Aug 2025
Train-Dev-Test Sets
modeling
05 Aug 2025
Transaction
database
05 Aug 2025
Transfer Learning
modeling
05 Aug 2025
Transformed Target Regressor
regressor
transformation
05 Aug 2025
Transformer
deep_learning
NLP
05 Aug 2025
Transformers vs RNNs
deep_learning
05 Aug 2025
T-test
statistics
05 Aug 2025
TF-IDF Implementation
NLP
05 Aug 2025
TF-IDF
code_snippet
NLP
preprocessing
05 Aug 2025
TOML
file_type
documentation
05 Aug 2025
TS_Anomaly_Detection.py
blank
05 Aug 2025
Tableau
visualization
05 Aug 2025
Technical Debt
business
documentation
software
05 Aug 2025
Technical Design Doc Template
documentation
05 Aug 2025
Telecommunications
career
05 Aug 2025
Tensorflow
deep_learning
software
05 Aug 2025
Terminal commands
software
05 Aug 2025
Test Loss When Evaluating Models
evaluation
05 Aug 2025
Testing
python
software
05 Aug 2025
Standard deviation
math
communication
05 Aug 2025
Standardisation
preprocessing
05 Aug 2025
Star Schema
database
05 Aug 2025
Statistical Assumptions
explainability
selection
05 Aug 2025
Statistical Tests
statistics
05 Aug 2025
Statistical theorems
statistics
05 Aug 2025
Statistics
portal
statistics
05 Aug 2025
Stemming
NLP
05 Aug 2025
Stochastic Gradient Descent
math
statistics
05 Aug 2025
Stored Procedures
algorithm
database
05 Aug 2025
Streamlit
exploration
ml
software
visualization
05 Aug 2025
Strongly vs Weakly typed language
software
05 Aug 2025
Structuring and organizing data
transformation
05 Aug 2025
Summarisation
NLP
05 Aug 2025
Supervised Learning
field
05 Aug 2025
Support Vector Classifier
classifier
05 Aug 2025
Support Vector Machines
classifier
clustering
05 Aug 2025
Support Vector Regression
algorithm
regressor
05 Aug 2025
Symbolic computation
math
05 Aug 2025
Sympy
math
05 Aug 2025
Smart Grids
energy
05 Aug 2025
Snowflake Schema
database
orchestration
05 Aug 2025
Snowflake vs Hadoop
architecture
storage
05 Aug 2025
Snowflake
architecture
05 Aug 2025
Soft Deletion
database
management
05 Aug 2025
Software Design Patterns
architecture
communication
software
05 Aug 2025
Software Development Life Cycle
orchestration
05 Aug 2025
Software Development Portal
portal
05 Aug 2025
SparseCategoricalCrossentropy or CategoricalCrossentropy
deep_learning
classifier
05 Aug 2025
Spearman vs Pearson Correlation
analysis
statistics
05 Aug 2025
Specificity
evaluation
05 Aug 2025
Spreadsheets vs Databases
management
storage
05 Aug 2025
Stacking
05 Aug 2025
Scipy
python
05 Aug 2025
Seaborn
python
visualization
05 Aug 2025
Security Researcher
security
05 Aug 2025
Security Vulnerabilities
software
05 Aug 2025
Security mitigation
security
05 Aug 2025
Self Attention
NLP
05 Aug 2025
Self attention vs multi-head attention
NLP
05 Aug 2025
Self-Attention
NLP
devops
architecture
05 Aug 2025
Semantic Relationships
language_models
NLP
05 Aug 2025
Semantic search
GenAI
05 Aug 2025
Sentence Similarity
NLP
05 Aug 2025
Sentence Transformer Workflow
process
NLP
05 Aug 2025
Sentence Transformers
deep_learning
NLP
05 Aug 2025
Sharepoint
cloud
communication
documentation
05 Aug 2025
Silhouette Analysis
analysis
clustering
05 Aug 2025
Similarity Search
language_models
NLP
05 Aug 2025
Single source of truth
management
storage
05 Aug 2025
Sklearn Pipeline
code_snippet
transformation
05 Aug 2025
Slowly Changing Dimension
database
05 Aug 2025
Small Language Models
language_models
NLP
05 Aug 2025
SQLAlchemy
python
database
05 Aug 2025
SQLite Studio
database
software
05 Aug 2025
SQLite
database
05 Aug 2025
SVM_Example.py
blank
05 Aug 2025
Sampling
statistics
05 Aug 2025
Sarsa
deep_learning
05 Aug 2025
Scala
software
05 Aug 2025
Scalability
management
05 Aug 2025
Scaling Agentic Systems
GenAI
language_models
05 Aug 2025
Scaling Data Science Capability
business
05 Aug 2025
Scaling Server
devops
05 Aug 2025
Scatter Plots
visualization
05 Aug 2025
Scientific Method
drafting
field
05 Aug 2025
Scikit-Learn
analysis
05 Aug 2025
Recall
evaluation
05 Aug 2025
Recommender systems
evaluation
modeling
05 Aug 2025
Recurrent Neural Networks
deep_learning
time_series
05 Aug 2025
Recursive Algorithm
algorithm
05 Aug 2025
Registering a Scheduled Task
software
05 Aug 2025
Regression metrics
code_snippet
evaluation
05 Aug 2025
Regression
regressor
statistics
05 Aug 2025
Regression_Logistic_Metrics.ipynb
blank
05 Aug 2025
Regularisation of Tree based models
evaluation
explainability
optimisation
05 Aug 2025
Regularisation
explainability
optimisation
process
visualization
05 Aug 2025
Regularisation.py
blank
05 Aug 2025
Reinforcement learning
field
ml
05 Aug 2025
Relating Tables Together
database
05 Aug 2025
Relational Database
database
05 Aug 2025
Relationships in memory
language_models
memory_management
05 Aug 2025
Relu
deep_learning
05 Aug 2025
Reveal.js
communication
05 Aug 2025
Reward Function
deep_learning
05 Aug 2025
Ridge
drafting
05 Aug 2025
Root Mean Squared Error
statistics
05 Aug 2025
Row-based Storage
storage
05 Aug 2025
SHapley Additive exPlanations
modeling
explainability
05 Aug 2025
SMOTE (Synthetic Minority Over-sampling Technique)
05 Aug 2025
SSMS
database
05 Aug 2025
SQL Groupby
querying
transformation
05 Aug 2025
SQL Injection
SQL
05 Aug 2025
SQL Joins
SQL
05 Aug 2025
SQL Window functions
analysis
querying
05 Aug 2025
SQL vs NoSQL
modeling
storage
05 Aug 2025
SQL
database
software
05 Aug 2025
SQLAlchemy vs. sqlite3
database
python
05 Aug 2025
React
frontend
05 Aug 2025
Reasoning tokens
GenAI
math
05 Aug 2025
Pyright vs Pydantic
python
data_quality
05 Aug 2025
Pyright
prompt
05 Aug 2025
Pytest
devops
python
05 Aug 2025
Python Click
python
05 Aug 2025
Python
python
05 Aug 2025
Pytorch vs Tensorflow
deep_learning
ml
05 Aug 2025
Q-Learning
algorithm
regressor
05 Aug 2025
Q-Q Plot
visualization
statistics
05 Aug 2025
Quartz
software
05 Aug 2025
Query Optimisation
querying
optimisation
05 Aug 2025
Querying
analysis
database
exploration
05 Aug 2025
QuickSort
algorithm
05 Aug 2025
R squared
statistics
05 Aug 2025
R-squared metric not always a good indicator of model performance in regression
statistics
05 Aug 2025
R
statistics
05 Aug 2025
RAG
language_models
05 Aug 2025
REST API
devops
05 Aug 2025
ROC (Receiver Operating Characteristic)
evaluation
05 Aug 2025
ROC_Curve.py
blank
05 Aug 2025
Race Conditions
database
devops
05 Aug 2025
Random Access Memory
system
05 Aug 2025
Random Forest Regression
regressor
ml
05 Aug 2025
Random Forests
classifier
drafting
05 Aug 2025
Prevention Is Better Than the Cure
data_quality
05 Aug 2025
Primary Key
database
05 Aug 2025
Principal Component Analysis
cleaning
visualization
05 Aug 2025
Probability
statistics
05 Aug 2025
Problem Definition
communication
05 Aug 2025
Process Based Parallelism
system
05 Aug 2025
Processes vs Threads
software
05 Aug 2025
Project Management Portal
portal
05 Aug 2025
Prompt engineering
language_models
NLP
05 Aug 2025
Prompts
prompt
05 Aug 2025
Proportion Test
statistics
test
05 Aug 2025
Publish and Subscribe
devops
orchestration
05 Aug 2025
Pull Request Template
documentation
05 Aug 2025
PyCaret
python
software
ml
05 Aug 2025
PyGraphviz
graph
05 Aug 2025
PyOD
anomaly_detection
05 Aug 2025
PySpark
orchestration
python
05 Aug 2025
PyTorch
deep_learning
python
05 Aug 2025
Pycaret_Anomaly.ipynb
blank
05 Aug 2025
Pycaret_Example.py
blank
05 Aug 2025
Pydantic
python
management
05 Aug 2025
Pydantic.py
blank
05 Aug 2025
Pydantic_More.py
blank
05 Aug 2025
Percentile Detection
anomaly_detection
05 Aug 2025
Performance Dimensions
portal
05 Aug 2025
Performance Drift
data_quality
evaluation
explainability
05 Aug 2025
Physical Model
documentation
SQL
database
05 Aug 2025
Pickle
file_type
storage
05 Aug 2025
Plotly
python
visualization
05 Aug 2025
Poetry
system
management
05 Aug 2025
Policy
question
05 Aug 2025
Polynomial Regression
ml
modeling
05 Aug 2025
Positional Encoding
deep_learning
NLP
05 Aug 2025
PostgreSQL
database
management
05 Aug 2025
Postman
devops
05 Aug 2025
PowerBI
software
visualization
05 Aug 2025
PowerShell
software
system
05 Aug 2025
Powerquery
software
05 Aug 2025
Powershell scripts
system
05 Aug 2025
Powershell versus Command Prompt
software
system
05 Aug 2025
Powershell vs Bash
system
05 Aug 2025
Precision or Recall
evaluation
05 Aug 2025
Precision-Recall Curve
evaluation
05 Aug 2025
Precision
evaluation
05 Aug 2025
Prediction Intervals
statistics
05 Aug 2025
Preprocessing
optimisation
cleaning
portal
preprocessing
transformation
05 Aug 2025
Over parameterised models
explainability
modeling
05 Aug 2025
Overfitting
architecture
05 Aug 2025
PCA Explained Variance Ratio
explainability
05 Aug 2025
PCA Principal Components
ml
explainability
05 Aug 2025
PCA-Based Anomaly Detection
anomaly_detection
05 Aug 2025
PCA_Analysis.ipynb
blank
05 Aug 2025
PCA_Based_Anomaly_Detection.py
blank
05 Aug 2025
PDP and ICE
evaluation
05 Aug 2025
Page Rank
graph
visualization
05 Aug 2025
Pandas Dataframe Agent
agents
05 Aug 2025
Pandas Pivot Table
transformation
python
05 Aug 2025
Pandas Stack
transformation
05 Aug 2025
Pandas join vs merge
cleaning
transformation
05 Aug 2025
Pandas
python
transformation
05 Aug 2025
Pandas_Common.py
blank
05 Aug 2025
Pandas_Stack.py
blank
05 Aug 2025
Pandoc
documentation
tool
05 Aug 2025
Parametric tests
statistics
test
05 Aug 2025
Parquet
storage
05 Aug 2025
Part of speech tagging
NLP
05 Aug 2025
Normalisation vs Standardisation
ml
statistics
05 Aug 2025
Normalisation
portal
05 Aug 2025
Normalised Schema
database
05 Aug 2025
NotebookLM
tool
05 Aug 2025
Numpy
python
data_structure
05 Aug 2025
OLAP (online analytical processing)
analysis
tool
database
05 Aug 2025
OLTP
management
05 Aug 2025
OOV words
NLP
05 Aug 2025
Object Relational Mapper
SQL
programming
05 Aug 2025
Odds vs Probability
math
statistics
05 Aug 2025
Odds
statistics
05 Aug 2025
One Pager Template
documentation
05 Aug 2025
One-hot encoding
transformation
preprocessing
05 Aug 2025
One_hot_encoding.py
blank
05 Aug 2025
Operational Resilience for Growth and Adaptability
business
05 Aug 2025
Optimisation function
optimisation
selection
05 Aug 2025
Optimisation techniques
optimisation
process
05 Aug 2025
Optimising Neural Networks
optimisation
deep_learning
05 Aug 2025
Optimising a Logistic Regression Model
optimisation
ml
05 Aug 2025
Optuna
optimisation
05 Aug 2025
Ordinary Least Squares
ml
05 Aug 2025
Orthogonalization
explainability
ml
05 Aug 2025
Outliers
anomaly_detection
cleaning
statistics
05 Aug 2025
Multiprocessing
system
05 Aug 2025
Multithreading
programming
system
05 Aug 2025
MySQL
database
management
05 Aug 2025
NET
software
file_type
05 Aug 2025
NLP
NLP
05 Aug 2025
Naive Bayes
classifier
05 Aug 2025
Named Entity Recognition
modeling
NLP
05 Aug 2025
Network Design
energy
05 Aug 2025
Neural Network Classification
deep_learning
05 Aug 2025
Neural Scaling Laws
drafting
05 Aug 2025
Neural network in Practice
deep_learning
05 Aug 2025
Neural network
deep_learning
drafting
05 Aug 2025
Ngrams
NLP
05 Aug 2025
NoSQL
SQL
05 Aug 2025
Node.JS
programming
05 Aug 2025
Non-parametric tests
evaluation
statistics
05 Aug 2025
Normalisation of Text
code_snippet
NLP
05 Aug 2025
Normalisation of data
data_quality
modeling
statistics
05 Aug 2025
Multi-Agent Reinforcement Learning
deep_learning
agents
05 Aug 2025
Multi-head attention
deep_learning
NLP
05 Aug 2025
Multi-level index
transformation
05 Aug 2025
Multicollinearity
statistics
05 Aug 2025
Multinomial Naive bayes
classifier
statistics
ml
05 Aug 2025
Multiprocessing vs Multithreading
programming
05 Aug 2025
Methods for Handling Outliers
anomaly_detection
preprocessing
05 Aug 2025
Microsoft Access
database
software
05 Aug 2025
Microsoft
cloud
05 Aug 2025
Mini-batch gradient descent
math
ml
05 Aug 2025
Mixture of Experts
language_models
05 Aug 2025
Model Building
modeling
selection
05 Aug 2025
Model Cascading
language_models
05 Aug 2025
Model Deployment
architecture
05 Aug 2025
Model Ensemble
architecture
modeling
05 Aug 2025
Model Evaluation vs Model Optimisation
evaluation
optimisation
05 Aug 2025
Model Evaluation
evaluation
modeling
05 Aug 2025
Model Interpretability
explainability
05 Aug 2025
Model Observability
explainability
modeling
05 Aug 2025
Model Optimisation
drafting
05 Aug 2025
Model Parameters Tuning
optimisation
selection
05 Aug 2025
Model Parameters
modeling
optimisation
05 Aug 2025
Model Selection
evaluation
process
05 Aug 2025
Model Validation
modeling
evaluation
05 Aug 2025
Model parameters vs hyperparameters
modeling
05 Aug 2025
Momentum
optimisation
05 Aug 2025
Momentum.py
blank
05 Aug 2025
MongoDB
system
05 Aug 2025
Monolith Architecture
architecture
05 Aug 2025
Monte Carlo Simulation
statistics
algorithm
05 Aug 2025
LightGBM
optimisation
05 Aug 2025
Linear Discriminant Analysis
analysis
05 Aug 2025
Linear Regression
regressor
05 Aug 2025
Linked List
code_snippet
data_structure
05 Aug 2025
Load Balancing
orchestration
05 Aug 2025
Local Interpretable Model-agnostic Explanations
explainability
05 Aug 2025
Local Outlier Factor (LOF)
clustering
05 Aug 2025
Log transformation
exploration
transformation
05 Aug 2025
Logical Model
documentation
database
05 Aug 2025
Logistic Regression Statsmodel Summary table
ml
regressor
05 Aug 2025
Logistic Regression does not predict probabilities
regressor
statistics
ml
05 Aug 2025
Logistic Regression
classifier
regressor
05 Aug 2025
Logistic regression in sklearn & Gradient Descent
ml
05 Aug 2025
Looker Studio
business
analysis
communication
05 Aug 2025
Loss versus Cost function
optimisation
05 Aug 2025
ML Engineer
career
05 Aug 2025
MNIST
exploration
05 Aug 2025
Machine Learning Algorithms
algorithm
modeling
05 Aug 2025
Machine Learning Operations
drafting
05 Aug 2025
Machine Learning
field
05 Aug 2025
Maintainable Code
devops
05 Aug 2025
Makefile
file_type
05 Aug 2025
Manifold learning
exploration
05 Aug 2025
Many-to-Many Relationships
database
data_structure
05 Aug 2025
Markov Decision Processes
modeling
05 Aug 2025
Markov chain
math
statistics
05 Aug 2025
Master Observability Datadog
governance
orchestration
05 Aug 2025
Mathematical Reasoning in Transformers
question
05 Aug 2025
Mathematics
math
portal
05 Aug 2025
Maximum Likelihood Estimation
statistics
modeling
05 Aug 2025
Mean Squared Error
statistics
05 Aug 2025
Memory Caching
system
05 Aug 2025
Memory
system
05 Aug 2025
Merge
transformation
05 Aug 2025
Mermaid
modeling
05 Aug 2025
Metadata Handling
explainability
05 Aug 2025
Learning Styles
architecture
05 Aug 2025
LightGBM vs XGBoost vs CatBoost
ml
05 Aug 2025
Kernel Density Estimation
statistics
ml
05 Aug 2025
Kernelling
ml
process
05 Aug 2025
Key Components of Attention and Formula
NLP
math
05 Aug 2025
Kmeans vs GMM
clustering
05 Aug 2025
Knowledge Graph
graph
NLP
05 Aug 2025
Knowledge Work
career
05 Aug 2025
Knowledge graph vs RAG setup
data_structure
GenAI
memory_management
05 Aug 2025
LBFGS
regressor
optimisation
05 Aug 2025
LLM Evaluation Metrics
evaluation
NLP
05 Aug 2025
LLM Memory
language_models
NLP
05 Aug 2025
LLM
language_models
05 Aug 2025
LSTM
deep_learning
time_series
05 Aug 2025
Label encoding vs One-hot encoding
preprocessing
05 Aug 2025
Label encoding
preprocessing
05 Aug 2025
Labelling data
process
05 Aug 2025
Langchain
GenAI
python
05 Aug 2025
Language Model Output Optimisation
language_models
optimisation
05 Aug 2025
Language Models Large (LLMs) vs Small (SLMs)
language_models
05 Aug 2025
Language Models
portal
05 Aug 2025
Lasso
drafting
05 Aug 2025
Latency
term
05 Aug 2025
Latent Dirichlet Allocation
explainability
NLP
05 Aug 2025
Learning Curve
evaluation
ml
05 Aug 2025
Imputation Techniques
cleaning
05 Aug 2025
In NER how would you handle ambiguous entities
NLP
05 Aug 2025
Indexing in cypher
optimisation
05 Aug 2025
Industries of interest
career
05 Aug 2025
Inertia K Means Cost Function
clustering
evaluation
05 Aug 2025
Input is Not Properly Sanitized
SQL
cleaning
05 Aug 2025
Interoperability
explainability
05 Aug 2025
Interpreting logistic regression model parameters
explainability
ml
05 Aug 2025
Interquartile Range (IQR) Detection
statistics
process
05 Aug 2025
Isolation Forest
anomaly_detection
data_quality
05 Aug 2025
Java vs JavaScript
software
05 Aug 2025
Java
programming
05 Aug 2025
JavaScript
programming
05 Aug 2025
Jobs to be done
business
communication
05 Aug 2025
Johnson–Lindenstrauss lemma
math
05 Aug 2025
Joining Datasets
transformation
05 Aug 2025
Json to SQLite
file_type
05 Aug 2025
Json
file_type
05 Aug 2025
Junction Tables
database
05 Aug 2025
Jupyter Book
communication
05 Aug 2025
Justfile
file_type
05 Aug 2025
K-means
clustering
05 Aug 2025
K-nearest neighbours
classifier
ml
05 Aug 2025
KNIME
exploration
transformation
05 Aug 2025
K_Means.py
blank
05 Aug 2025
Keras
python
deep_learning
05 Aug 2025
Hierarchical Clustering
clustering
05 Aug 2025
High cross validation accuracy is not directly proportional to performance on unseen test data
ml
explainability
05 Aug 2025
Honkit
tool
05 Aug 2025
Hosting
architecture
05 Aug 2025
How LLMs store facts
NLP
agents
05 Aug 2025
How businesses use Gen AI
business
GenAI
05 Aug 2025
How do we evaluate LLM Outputs
evaluation
05 Aug 2025
How is reinforcement learning being combined with deep learning
deep_learning
05 Aug 2025
How is schema evolution done in practice with SQL
question
05 Aug 2025
How to do git commit messages properly
process
05 Aug 2025
How to normalise a merged table
transformation
05 Aug 2025
How to reduce the need for Gen AI responses
business
GenAI
05 Aug 2025
How to search within a graph
graph
querying
05 Aug 2025
How to use Sklearn Pipeline
question
python
05 Aug 2025
How would you decide between using TF-IDF and Word2Vec for text vectorization
NLP
05 Aug 2025
Hugging Face
GenAI
software
05 Aug 2025
Hyperparameter Tuning
optimisation
process
05 Aug 2025
Hyperparameter
modeling
optimisation
05 Aug 2025
Hypothesis testing
statistics
05 Aug 2025
Imbalanced Datasets
cleaning
data_quality
exploration
05 Aug 2025
Imbalanced_Datasets_SMOTE.py
blank
05 Aug 2025
Immutable vs mutable
python
data_structure
05 Aug 2025
Impact of multicollinearity on model parameters
modeling
evaluation
statistics
05 Aug 2025
Implementing Database Schema
database
data_structure
05 Aug 2025
Grep
system
05 Aug 2025
GridSearchCV
optimisation
05 Aug 2025
Groupby vs Crosstab
transformation
05 Aug 2025
Groupby
transformation
05 Aug 2025
Grouped plots
statistics
visualization
05 Aug 2025
Guardrails
business
GenAI
05 Aug 2025
Hadoop
software
05 Aug 2025
Handling Different Distributions
statistics
management
05 Aug 2025
Handling Missing Data
preprocessing
transformation
05 Aug 2025
Handling_Missing_Data.ipynb
blank
05 Aug 2025
Handling_Missing_Data_Basic.ipynb
blank
05 Aug 2025
Hash
data_structure
05 Aug 2025
Heap Data Structure
data_structure
05 Aug 2025
Heap Memory
memory_management
05 Aug 2025
Heatmap
code_snippet
visualization
05 Aug 2025
Heatmaps_Dendrograms.py
blank
05 Aug 2025
Gradient Boosting
optimisation
05 Aug 2025
Gradient Descent
optimisation
05 Aug 2025
Gradient descent in linear regression
optimisation
ml
05 Aug 2025
Gradio
cloud
software
05 Aug 2025
Grain
storage
database
05 Aug 2025
Grammar method
NLP
05 Aug 2025
Graph Neural Network
graph
05 Aug 2025
Graph Query Language
05 Aug 2025
Graph Theory Community
clustering
graph
05 Aug 2025
Graph Theory
graph
math
05 Aug 2025
GraphRAG
drafting
05 Aug 2025
Forward Propagation
deep_learning
statistics
05 Aug 2025
Fuzzywuzzy
NLP
05 Aug 2025
GIS
file_type
software
05 Aug 2025
GPT
agents
cloud
tool
05 Aug 2025
GRU
ml
deep_learning
05 Aug 2025
Gartner Hype Cycle
business
learning
05 Aug 2025
Gaussian Distribution
statistics
05 Aug 2025
Gaussian Mixture Models
clustering
05 Aug 2025
Gaussian Model
clustering
05 Aug 2025
General Linear Regression
regressor
05 Aug 2025
Generative AI From Theory to Practice
GenAI
05 Aug 2025
Generative AI
GenAI
05 Aug 2025
Generative Adversarial Networks
deep_learning
05 Aug 2025
Generators in Python
data_structure
python
05 Aug 2025
Gini Impurity vs Cross Entropy
evaluation
05 Aug 2025
Gini Impurity
statistics
ml
evaluation
05 Aug 2025
Git
software
05 Aug 2025
Gitlab
05 Aug 2025
Global Interpreter Lock
system
python
05 Aug 2025
Google Cloud Platform
cloud
05 Aug 2025
Google Colab
analysis
cloud
communication
05 Aug 2025
Google My Maps Data Extraction
devops
05 Aug 2025
Google OR Tools
math
optimisation
05 Aug 2025
Google Sheet Pivots Table
transformation
05 Aug 2025
Google Sheets
business
software
05 Aug 2025
Gradient Boosting Regressor
regressor
05 Aug 2025
Feature Selection
evaluation
explainability
modeling
process
selection
05 Aug 2025
Feature_Distribution.py
blank
05 Aug 2025
Feed Forward Neural Network
classifier
deep_learning
05 Aug 2025
Feedback Template
documentation
05 Aug 2025
File Management
management
05 Aug 2025
Filter method
explainability
statistics
05 Aug 2025
Firebase
frontend
devops
05 Aug 2025
Fishbone diagram
communication
documentation
05 Aug 2025
Fitting weights and biases of a neural network
deep_learning
ml
05 Aug 2025
Flask
python
05 Aug 2025
Folder Tree Diagram
system
05 Aug 2025
Forecasting_AutoArima.py
blank
05 Aug 2025
Forecasting_Baseline.py
blank
05 Aug 2025
Forecasting_Exponential_Smoothing.py
blank
05 Aug 2025
Foreign Key
database
05 Aug 2025
Fabric
cloud
05 Aug 2025
Factor Analysis
analysis
statistics
05 Aug 2025
Factor_Analysis.py
blank
05 Aug 2025
FastAPI
frontend
devops
05 Aug 2025
FastAPI_Example.py
blank
05 Aug 2025
Feature Engineering
optimisation
process
05 Aug 2025
Feature Evaluation
evaluation
exploration
05 Aug 2025
Feature Extraction
ml
preprocessing
transformation
05 Aug 2025
Feature Importance
evaluation
explainability
process
05 Aug 2025
Feature Scaling
cleaning
preprocessing
05 Aug 2025
Feature Selection vs Feature Importance
selection
exploration
05 Aug 2025
Encoding Categorical Variables
cleaning
preprocessing
regressor
05 Aug 2025
Energy ABM
modeling
05 Aug 2025
Energy Storage
energy
05 Aug 2025
Energy
energy
05 Aug 2025
Environment Variables
devops
system
05 Aug 2025
Epoch
deep_learning
ml
05 Aug 2025
Epub
file_type
05 Aug 2025
Estimator
storage
05 Aug 2025
Evaluate Embedding Methods
analysis
evaluation
nlp
05 Aug 2025
Evaluating Language Models
evaluation
language_models
05 Aug 2025
Evaluating the effectiveness of prompts
GenAI
question
05 Aug 2025
Evaluation Metrics
code_snippet
evaluation
05 Aug 2025
Event Driven Events
devops
05 Aug 2025
Event Driven Microservices
software
architecture
05 Aug 2025
Event Driven
devops
05 Aug 2025
Event-Driven Architecture
architecture
05 Aug 2025
Everything
software
05 Aug 2025
Excel pivot table
tool
transformation
05 Aug 2025
Excel vs Google Sheets
tool
05 Aug 2025
Excel
business
file_type
software
05 Aug 2025
Experiment Plan Template
documentation
05 Aug 2025
Exploration vs Exploitation
deep_learning
05 Aug 2025
F-statistic
05 Aug 2025
F1 Score
evaluation
05 Aug 2025
FAISS
NLP
python
05 Aug 2025
Directed Acyclic Graph (DAG)
math
orchestration
05 Aug 2025
Distillation
GenAI
ml
language_models
05 Aug 2025
Distributed Computing
cloud
management
05 Aug 2025
Distribution_Analysis.py
blank
05 Aug 2025
Distributions
statistics
05 Aug 2025
Docker Image
software
05 Aug 2025
Docker
software
orchestration
05 Aug 2025
Documentation & Meetings
business
communication
05 Aug 2025
Dropout
deep_learning
optimisation
05 Aug 2025
DuckDB in python
python
database
05 Aug 2025
DuckDB vs SQLite
python
database
05 Aug 2025
Dummy variable trap
ml
modeling
preprocessing
05 Aug 2025
EDA
analysis
exploration
transformation
05 Aug 2025
ELT
transformation
05 Aug 2025
ER Diagrams
database
visualization
05 Aug 2025
ETL Pipeline example
transformation
05 Aug 2025
ETL
transformation
05 Aug 2025
Edge ML
architecture
05 Aug 2025
Education and Training
deep_learning
05 Aug 2025
Elastic Net
code_snippet
05 Aug 2025
ElasticSearch
NLP
tool
05 Aug 2025
Embedded Methods
selection
05 Aug 2025
Debugging
exploration
05 Aug 2025
Debugging.py
blank
05 Aug 2025
Decision Tree
classifier
regressor
05 Aug 2025
Declarative Data Pipeline
orchestration
process
05 Aug 2025
Deep Learning Frameworks
deep_learning
python
05 Aug 2025
Deep Learning
deep_learning
05 Aug 2025
Deep Q-Learning
deep_learning
05 Aug 2025
Demand forecasting
energy
05 Aug 2025
Dendrograms
clustering
visualization
05 Aug 2025
Design Thinking Questions
business
05 Aug 2025
Determining Threshold Values
evaluation
selection
05 Aug 2025
DevOps
orchestration
05 Aug 2025
Differentiation
math
05 Aug 2025
Digital Transformation
business
governance
05 Aug 2025
Digital twin
explainability
modeling
05 Aug 2025
Dimension Table
database
modeling
05 Aug 2025
Dimensional Modelling
database
modeling
05 Aug 2025
Dimensionality Reduction
process
visualization
05 Aug 2025
Database Management System (DBMS)
database
management
05 Aug 2025
Database Storage
cleaning
database
storage
05 Aug 2025
Database Techniques
database
portal
05 Aug 2025
Database schema
database
modeling
05 Aug 2025
Database
database
storage
05 Aug 2025
Databricks vs Snowflake
cloud
05 Aug 2025
Databricks
cloud
05 Aug 2025
Datasets
exploration
05 Aug 2025
Debugging ipynb
exploration
05 Aug 2025
Data Modelling
database
modeling
05 Aug 2025
Data Observability
orchestration
management
05 Aug 2025
Data Orchestration
orchestration
05 Aug 2025
Data Pipeline to Data Products
transformation
05 Aug 2025
Data Pipeline
management
process
05 Aug 2025
Data Principles
data_quality
governance
portal
05 Aug 2025
Data Reduction
management
preprocessing
05 Aug 2025
Data Roles
business
05 Aug 2025
Data Science
field
05 Aug 2025
Data Scientist
business
05 Aug 2025
Data Security
governance
security
05 Aug 2025
Data Selection in ML
selection
05 Aug 2025
Data Selection
selection
transformation
05 Aug 2025
Data Steward
business
05 Aug 2025
Data Streaming
orchestration
05 Aug 2025
Data Transformation with Pandas
transformation
05 Aug 2025
Data Transformation
cleaning
transformation
05 Aug 2025
Data Validation
governance
management
05 Aug 2025
Data Visualisation
analysis
05 Aug 2025
Data Warehouse
database
storage
05 Aug 2025
Data storage
database
storage
05 Aug 2025
Data transformation in Data Engineering
transformation
05 Aug 2025
Data transformation in Machine Learning
transformation
ml
05 Aug 2025
Database Index
database
optimisation
05 Aug 2025
DS & ML Portal
portal
05 Aug 2025
Dash
python
frontend
05 Aug 2025
Dashboarding
analysis
frontend
05 Aug 2025
Data AI Education at Work
business
05 Aug 2025
Data Analysis Portal
analysis
portal
05 Aug 2025
Data Analysis
analysis
05 Aug 2025
Data Analyst
analysis
05 Aug 2025
Data Architect
architecture
05 Aug 2025
Data Assessment
process
05 Aug 2025
Data Cleansing
cleaning
portal
transformation
05 Aug 2025
Data Collection
process
05 Aug 2025
Data Contract
business
05 Aug 2025
Data Distribution
analysis
communication
business
05 Aug 2025
Data Drift
modeling
statistics
05 Aug 2025
Data Engineer
career
field
05 Aug 2025
Data Engineering Portal
portal
05 Aug 2025
Data Engineering Tools
management
tool
05 Aug 2025
Data Engineering
field
05 Aug 2025
Data Ingestion
management
preprocessing
05 Aug 2025
Data Integrity
data_quality
management
05 Aug 2025
Data Lake
storage
05 Aug 2025
Data Lakehouse
storage
05 Aug 2025
Data Leakage
memory_management
storage
05 Aug 2025
Data Lifecycle Management
management
portal
05 Aug 2025
Data Management
data_quality
management
05 Aug 2025
Data Mining - CRISP
business
05 Aug 2025
Cross_Entropy_Single.py
blank
05 Aug 2025
Crosstab
transformation
05 Aug 2025
Cryptography
code_snippet
math
security
05 Aug 2025
Curse of dimensionality
cleaning
05 Aug 2025
Cypher
graph
querying
05 Aug 2025
DBScan
clustering
05 Aug 2025
Concurrency
system
05 Aug 2025
Confidence Interval
statistics
05 Aug 2025
Confusion Matrix
evaluation
05 Aug 2025
Continuous Delivery - Deployment
devops
05 Aug 2025
Continuous Integration
devops
05 Aug 2025
Convolutional Neural Networks
deep_learning
05 Aug 2025
Correlation vs Causation
statistics
analysis
05 Aug 2025
Correlation
statistics
05 Aug 2025
Cosine Similarity
math
NLP
05 Aug 2025
Cost Function
ml
optimisation
05 Aug 2025
Cost-Sensitive Analysis
evaluation
05 Aug 2025
Covariance Structures
05 Aug 2025
Covariance vs Correlation
statistics
05 Aug 2025
Covariance
analysis
statistics
05 Aug 2025
Covering Index
05 Aug 2025
Cron jobs
devops
05 Aug 2025
Cross Entropy
architecture
optimisation
05 Aug 2025
Cross validation
evaluation
05 Aug 2025
Cross_Entropy.py
blank
05 Aug 2025
Chi-Squared Test
statistics
05 Aug 2025
Choosing a Threshold
evaluation
ml
05 Aug 2025
Choosing the Number of Clusters
preprocessing
clustering
05 Aug 2025
Class Separability
data_quality
evaluation
05 Aug 2025
Classification Report
evaluation
05 Aug 2025
Classification
classifier
ml
05 Aug 2025
Claude
GenAI
05 Aug 2025
Click_Implementation.py
blank
05 Aug 2025
Cloud Providers
storage
05 Aug 2025
Cluster Density
clustering
05 Aug 2025
Cluster Separation
clustering
05 Aug 2025
Clustering
clustering
05 Aug 2025
Clustering_Dashboard.py
code_snippet
05 Aug 2025
Clustermap
ml
preprocessing
05 Aug 2025
Code Diagrams
communication
documentation
05 Aug 2025
Columnar Storage
storage
05 Aug 2025
Command Prompt
system
05 Aug 2025
Command line
system
05 Aug 2025
Common Table Expression
database
querying
05 Aug 2025
Communication Techniques
business
communication
05 Aug 2025
Communication principles
communication
05 Aug 2025
Comparing LLMs
language_models
05 Aug 2025
Comparing_Ensembles.py
blank
05 Aug 2025
Components of the database
database
05 Aug 2025
Computer Science
field
05 Aug 2025
Conceptual Model
communication
05 Aug 2025
Big Data
orchestration
storage
05 Aug 2025
BigQuery
database
cloud
05 Aug 2025
Binder
communication
visualization
05 Aug 2025
Boosting
architecture
explainability
05 Aug 2025
Bootstrap Sampling
statistics
05 Aug 2025
Boxplot
statistics
cleaning
visualization
05 Aug 2025
Business observability
business
05 Aug 2025
Business value of anomaly detection
business
anomaly_detection
05 Aug 2025
CI-CD
devops
05 Aug 2025
CRUD
database
05 Aug 2025
CUDA
deep_learning
05 Aug 2025
Causal Inference
statistics
05 Aug 2025
CatBoost
ml
python
05 Aug 2025
Central Limit Theorem & Small Sample Sizes
statistics
05 Aug 2025
Central Limit Theorem
statistics
05 Aug 2025
Chain of thought
language_models
05 Aug 2025
Change Management
business
05 Aug 2025
ChatGPT
language_models
05 Aug 2025
Checksum
security
algorithm
05 Aug 2025
Bandit_Example_Fixed.py
blank
05 Aug 2025
Bash
system
05 Aug 2025
Batch Normalisation
ml
05 Aug 2025
Batch Processing
orchestration
process
system
05 Aug 2025
Bellman Equations
math
modeling
05 Aug 2025
Benefits of Data Transformation
transformation
05 Aug 2025
Bernoulli
statistics
05 Aug 2025
Bias and variance
architecture
explainability
05 Aug 2025
Alternatives to Batch Processing
orchestration
05 Aug 2025
Amazon S3
cloud
storage
05 Aug 2025
Anomaly Detection in Time Series
anomaly_detection
05 Aug 2025
Anomaly Detection with Clustering
anomaly_detection
clustering
05 Aug 2025
Anomaly Detection with Statistical Methods
anomaly_detection
ml
statistics
05 Aug 2025
Anomaly Detection
anomaly_detection
05 Aug 2025
Apache Airflow
orchestration
software
05 Aug 2025
Apache Iceberg
modeling
storage
05 Aug 2025
Apache Kafka
orchestration
software
05 Aug 2025
Apache Spark
software
05 Aug 2025
Asking questions
business
communication
GenAI
learning
NLP
05 Aug 2025
Assumption of Normality
statistics
05 Aug 2025
Attack mitigation
security
05 Aug 2025
Attack types
security
05 Aug 2025
Attention Is All You Need
ml
05 Aug 2025
Attention mechanism
language_models
05 Aug 2025
Automated Feature Creation
transformation
05 Aug 2025
Azure
cloud
storage
05 Aug 2025
BERT Pretraining of Deep Bidirectional Transformers for Language Understanding
language_models
05 Aug 2025
BERT
deep_learning
language_models
NLP
05 Aug 2025
BERTScore
language_models
05 Aug 2025
Backpropagation
deep_learning
optimisation
statistics
05 Aug 2025
Bag of words
NLP
05 Aug 2025
Bag_of_Words.py
blank
05 Aug 2025
Bagging
architecture
05 Aug 2025
Bandit example output
blank
05 Aug 2025
AI Engineer
ml
05 Aug 2025
AI governance
GenAI
governance
05 Aug 2025
ANOVA
05 Aug 2025
API Driven Microservices
business
software
05 Aug 2025
API
software
05 Aug 2025
ARIMA
modeling
05 Aug 2025
AUC
evaluation
05 Aug 2025
AWS Lambda
05 Aug 2025
Accessing Gen AI generated content
evaluation
GenAI
05 Aug 2025
Accuracy
evaluation
05 Aug 2025
Activation Function
deep_learning
05 Aug 2025
Activation atlases
visualization
05 Aug 2025
Active Learning
classifier
05 Aug 2025
Ada boosting
architecture
05 Aug 2025
Adam Optimizer
modeling
optimisation
05 Aug 2025
Adaptive Learning Rates
learning
modeling
05 Aug 2025
Adding a database to PostgreSQL
database
05 Aug 2025
Addressing Multicollinearity
statistics
05 Aug 2025
Addressing_Multicollinearity.py
blank
05 Aug 2025
Adjusted R squared
evaluation
statistics
05 Aug 2025
Agent Exploration
drafting
05 Aug 2025
Agent-based modelling
modeling
05 Aug 2025
Agentic Solutions
drafting
05 Aug 2025
Aggregation
analysis
transformation
05 Aug 2025
Algorithms
algorithm
05 Aug 2025
Altair
visualization
05 Aug 2025
1-on-1 Template
documentation
05 Aug 2025
1-to-1's with a Line Manager
business
career
communication
05 Aug 2025
AB testing
software
05 Aug 2025
ACID Transaction
database
storage
05 Aug 2025
AI Agents Memory
evaluation
NLP
optimisation
Linear Discriminant Analysis
Linear Regression
Linked List
LLM
LLM Evaluation Metrics
LLM Memory
Load Balancing
Local Interpretable Model-agnostic Explainations
Local Outlier Factor (LOF)
Log transformation
Logical Model
Logistic Regression
Logistic Regression does not predict probabilities
Logistic regression in sklearn & Gradient Descent
Logistic Regression Statsmodel Summary table
Looker Studio
loss function
Loss versus Cost function
LSTM
Machine Learning
Machine Learning Algorithms
Machine Learning Operations
maintainability
Maintainable Code
Makefile
Manifold learning
Many-to-Many Relationships
map reduce
Markov chain
Markov Decision Processes
master data management
Master Observability Datadog
Mathematical Reasoning in Transformers
Mathematics
Maximum Likelihood Estimation
mean absolute error
Mean Squared Error
mean vs median
melt
Memory
Memory Caching
Merge
Mermaid
Metadata Handling
Methods for Handling Outliers
metric
Microsoft
Microsoft Access
Mini-batch gradient descent
Mixture of Experts
ML Engineer
MNIST
Model Building
Model Cascading
Model Deployment
Model Ensemble
Model Evaluation
Model Evaluation vs Model Optimisation
Model Interpretability
Model Observability
Model Optimisation
Model Parameters
Model Parameters Tuning
Model parameters vs hyperparameters
Model Selection
Model Validation
model-agnostic feature importance
Momentum
Momentum.py
MongoDB
Monolith Architecture
Monte Carlo Simulation
Multi-Agent Reinforcement Learning
Multi-head attention
Multi-level index
Multicollinearity
Multinomial Naive bayes
Multiprocessing
Multiprocessing vs Multithreading
Multithreading
MySql
Naive Bayes
Named Entity Recognition
nbconvert
nbconvert slideshows
neo4j
neomodel
NET
Network Design
Neural network
Neural Network Classification
Neural network in Practice
Neural Scaling Laws
Ngrams
NLP
nltk
Node.JS
non-parametric
Non-parametric tests
Normalisation
Normalisation of data
Normalisation of Text
Normalisation vs Standardisation
Normalised Schema
NoSQL
NotebookLM
npy Files A NumPy Array storage
Numpy
Object Relational Mapper
Odds
Odds vs Probability
OLAP (online analytical processing)
OLTP
One Pager Template
One_hot_encoding.py
One-hot encoding
OOV words
Operational Resilience for Growth and Adaptability
Optimisation function
Optimisation techniques
Optimising a Logistic Regression Model
Optimising Neural Networks
Optuna
Ordinary Least Squares
Orthogonalization
Outliers
Over parameterised models
Overfitting
p values
Page Rank
Pandas
Pandas Dataframe Agent
Pandas join vs merge
Pandas Pivot Table
Pandas Stack
Pandas_Common.py
Pandas_Stack.py
Pandoc
Parametric tests
parametric vs non-parametric models
parametric vs non-parametric tests
Parquet
parsimonious
Part of speech tagging
PCA Explained Variance Ratio
PCA Principal Components
PCA_Analysis.ipynb
PCA_Based_Anomaly_Detection.py
PCA-Based Anomaly Detection
pd.Grouper
pdoc
PDP and ICE
Percentile Detection
Performance Dimensions
Performance Drift
Physical Model
Pickle
Plotly
pmdarima
Poetry
Policy
Polynomial Regression
Positional Encoding
PostgreSQL
Postman
PowerBI
Powerquery
PowerShell
Powershell scripts
Powershell versus Command Prompt
Powershell vs Bash
Precision
Precision or Recall
Precision-Recall Curve
Prediction Intervals
Preprocessing
Prevention Is Better Than the Cure
Primary Key
Principal Component Analysis
Probability
Problem Definition
Process Based Parallelism
Processes vs Threads
programming languages
Project Management Portal
Prompt engineering
prompt retrievers
Prompts
Proportion Test
Publish and Subscribe
Pull Request Template
push-down
PyCaret
Pycaret_Anomaly.ipynb
Pycaret_Example.py
Pydantic
Pydantic_More.py
Pydantic.py
PyGraphviz
PyOD
Pyright
Pyright vs Pydantic
PySpark
Pytest
Python
Python Click
PyTorch
Pytorch vs Tensorflow
Q-Learning
Q-Q Plot
Quartz
Query Optimisation
Querying
QuickSort
R
R squared
R-squared metric not always a good indicator of model performance in regression
Race Conditions
RAG
Random Access Memory
Random Forest Regression
Random Forests
React
Reasoning tokens
Recall
Recommender systems
Recurrent Neural Networks
Recursive Algorithm
Registering a Scheduled Task
Regression
Regression metrics
Regression_Logistic_Metrics.ipynb
Regularisation
Regularisation of Tree based models
Regularisation.py
Reinforcement learning
Relating Tables Together
Relational Database
Relationships in memory
Relu
REST API
Reveal.js
reverse etl
Reward Function
Ridge
ROC (Receiver Operating Characteristic)
ROC_Curve.py
rollup
Root Mean Squared Error
Row-based Storage
Sampling
Sarsa
Scala
Scalability
Scaling Agentic Systems
Scaling Data Science Capability
Scaling Server
Scatter Plots
schema evolution
Scientific Method
Scikit-Learn
Scipy
Seaborn
search
Security mitigation
Security Researcher
Security Vulnerabilities
Self Attention
Self attention vs multi-head attention
Self-Attention
semantic layer
Semantic Relationships
Semantic search
semi-structured data
Sentence Similarity
Sentence Transformer Workflow
Sentence Transformers
shapefile
SHapley Additive exPlanations
Sharepoint
Silhouette Analysis
Similarity Search
Single source of truth
sklearn datasets
Sklearn Pipiline
Slowly Changing Dimension
Small Language Models
Smart Grids
SMOTE (Synthetic Minority Over-sampling Technique)
SMSS
Snowflake
Snowflake Schema
Snowflake vs Hadoop
Soft Deletion
Software Design Patterns
Software Development Life Cycle
Software Development Portal
spaCy
SparseCategorialCrossentropy or CategoricalCrossEntropy
Spearman vs Pearson Correlation
Specificity
Spreadsheets vs Databases
SQL
SQL Groupby
SQL Injection
SQL Joins
SQL vs NoSQL
SQL Window functions
SQLAlchemy
SQLAlchemy vs. sqlite3
SQLite
SQLite Studio
stack memory
Stacking
Standard deviation
Standardisation
Star Schema
Statistical Assumptions
Statistical Tests
Statistical theorems
Statistics
Stemming
Stochastic Gradient Descent
storage layer object store
Stored Procedures
Streamlit
Strongly vs Weakly typed language
structured data
Structuring and organizing data
Summarisation
Supervised Learning
Support Vector Classifier
Support Vector Machines
Support Vector Regression
SVM_Example.py
Symbolic computation
Sympy
syntactic relationships
t-SNE
T-test
Tableau
Technical Debt
Technical Design Doc Template
Telecommunications
Tensorflow
Terminal commands
Test Loss When Evaluating Models
Testing
Testing_Pytest.py
Testing_unittest.py
Text2Cypher
TF-IDF
TF-IDF Implementation
Thinking Systems
Time Series
Time Series Forecasting
Time Series Identify Trends and Patterns
Tokenisation
TOML
tool.bandit
tool.ruff
tool.uv
topic modeling
Train-Dev-Test Sets
Transaction
Transfer Learning
transfer_learning.py
Transformed Target Regressor
Transformer
Transformers vs RNNs
TS_Anomaly_Detection.py
Turning a flat file into a database
Type I Error (False Positive)
Type II Error (False Negative)
Types of Computational Bugs
Types of Database Schema
Types of Neural Networks
TypeScript
Typical Output Formats in Neural Networks
Ubuntu
UMAP
UML
unittest
univariate vs multivariate
Unix
unstructured data
Unsupervised learning
Use Cases for a Simple Neural Network Like
Use of RNNs in energy sector
Vacuum
vanishing and exploding gradients problem
Variability in linear models
variance
Vector Database
Vector Embedding
Vector_Embedding.py
Vectorisation
Vectorized Engine
Vercel
View Use Case
Views
Violin plot
Virtual environments
WCSS and elbow method
Weak Learners
Web Feature Server (WFS)
Web Map Tile Service (WMTS)
When and why not to us regularisation
Why does increasing the number of models in a ensemble not necessarily improve the accuracy
Why does the Adam Optimizer converge
Why is named entity recognition (NER) a challenging task
Why JSON is Better than Pickle for Untrusted Data
Why Removing Outliers May Improve Regression but Harm Classification
Why standardise features
Why Type 1 and Type 2 matter
Why use ER diagrams
Wikipedia_API.py
Windows Scheduled Tasks
Windows Subsystem for Linux
Word2vec
Word2Vec.py
WordNet
Wrapper Methods
Xaiver
XGBoost
yaml
Z-Normalisation
Z-Score
Z-Scores vs Prediction Intervals
Z-Test