Data Archive

    Folder: categories/uncategorised

    20 items under this folder.

    • 05 Apr 2026

      Demand Response in Markets

      • energy
      • market_design
    • 05 Apr 2026

      Demand Side Flexibility

      • energy
      • power_systems
    • 05 Apr 2026

      Demand Side Response

      • energy
      • terminology
    • 05 Apr 2026

      Distribution Network Operators

    • 05 Apr 2026

      Elexon and Settlement

      • energy
      • market_settlement
    • 05 Apr 2026

      Elexon

    • 05 Apr 2026

      Energy Sector Portal

    • 05 Apr 2026

      Energy Trading

    • 05 Apr 2026

      NIV-chasing

    • 05 Apr 2026

      National Grid ESO

    • 05 Apr 2026

      Untitled

    • 05 Apr 2026

      Virtual Power Plant

    • 05 Apr 2026

      statistical tests

    • 05 Apr 2026

      think_stats

    • 05 Apr 2026

      Balancing Mechanism Implications

      • energy
      • market_dynamics
      • renewables
    • 05 Apr 2026

      Balancing Mechanism Operation

      • energy
      • system_operation
    • 05 Apr 2026

      Balancing Mechanism Units

      • energy
      • market_participants
    • 05 Apr 2026

      Demand Response Baselines

      • energy
      • measurement
    • 05 Apr 2026

      Demand Response Economics

      • economics
      • energy
    • 05 Apr 2026

      Demand Response Operation

      • energy
      • renewables
      • system_operation

                              • Bias in ML
                              • Binary Classification
                              • Boosting
                              • Business value of anomaly detection
                              • CART
                              • CatBoost
                              • Challenges to Model Deployment
                              • Class Separability
                              • Classification
                              • Classification Report
                              • Cluster Density
                              • Cluster Separation
                              • Clustering
                              • Collaborative Filtering
                              • conceptual data model
                              • Confusion Matrix
                              • Cost Function
                              • Cost-Sensitive Analysis
                              • Cross Entropy
                              • Customer Growth Modeling
                              • Data Selection in ML
                              • Data Transformation in Machine Learning
                              • DBSCAN
                              • Decision Theory
                              • Decision Tree
                              • Decision Trees are Fragile
                              • Deep Learning Frameworks
                              • Deep Q-Learning
                              • Dendrograms
                              • Determining Threshold Values
                              • Dimension Table
                              • Dimensional Modelling
                              • Dimensionality Reduction
                              • Dimensions
                              • Distributions in Decision Tree Leaves
                              • Dropout
                              • Dummy variable trap
                              • Edge ML
                              • emergent behavior
                              • Encoding Categorical Variables
                              • Epoch
                              • Evaluating Language Models
                              • Evaluating Logistic Regression
                              • Evaluating the effectiveness of prompts
                              • Evaluation Metrics
                              • Exploration vs Exploitation
                              • Exponential Smoothing
                              • f-regression
                              • F1 Score
                              • Fact Table
                              • FAISS
                              • Feature Engineering for Time Series
                              • Feature Evaluation
                              • Feature Extraction
                              • Feature Importance
                              • Feature Selection
                              • Feature Transformations
                              • Feed Forward Neural Network
                              • Filter Methods
                              • Fitting weights and biases of a neural network
                              • Framework for models
                              • Gaussian Model
                              • General Linear Regression
                              • Generalisation
                              • Generative Adversarial Networks
                              • Gini Impurity
                              • Gini Impurity vs Cross Entropy
                              • Gradient Boosted Trees
                              • Gradient Boosting
                              • Gradient Boosting Regressor
                              • Gradient Descent
                              • Gradient descent in linear regression
                              • granularity
                              • Graph Neural Network
                              • Graph Theory Community
                              • GridSearchCV
                              • Growth Models in Time Series
                              • GRU
                              • Hierarchical Clustering
                              • High cross validation accuracy is not directly proportional to performance on unseen test data
                              • Histogram
                              • How do we evaluate LLM Outputs
                              • How to use Sklearn Pipeline
                              • Hyperparameter
                              • Hyperparameter Tuning
                              • ICE Plot
                              • Impact of multicollinearity on model parameters
                              • Inertia K Means Cost Function
                              • inference
                              • inference versus prediction
                              • initialization methods
                              • Interoperability
                              • interoperable
                              • Interpretability
                              • Interpreting logistic regression model parameters
                              • Isolation Forest
                              • Jaccard Coefficient
                              • K-means
                              • K-nearest neighbours
                              • Keras
                              • Kernel Density Estimation
                              • Kernelling
                              • Kmeans vs GMM
                              • L1 Regularisation
                              • Label encoding vs One-hot encoding
                              • Labelling data
                              • Lagrange multipliers in optimisation
                              • lambda architecture
                              • Latent Dirichlet Allocation
                              • Latent Semantic Indexing
                              • LBFGS
                              • Learning Curve
                              • Learning Rate
                              • Learning Styles
                              • LightGBM
                              • LightGBM vs XGBoost vs CatBoost
                              • Linear Regression
                              • LLM Evaluation Metrics
                              • Local Interpretable Model-agnostic Explanations
                              • Local Outlier Factor (LOF)
                              • Logistic Regression
                              • Logistic Regression does not predict probabilities
                              • Logistic regression in sklearn & Gradient Descent
                              • Logistic Regression Statsmodel Summary table
                              • Loss function
                              • Loss versus Cost function
                              • Machine Learning
                              • Machine Learning Operations
                              • Manifold Learning
                              • Markov Decision Processes
                              • Maximum Likelihood Estimation
                              • Median Absolute Error
                              • Mermaid
                              • Metadata Handling
                              • Methods for Handling Outliers
                              • Metric
                              • Mini-batch gradient descent
                              • Model Building
                              • Model Deployment using PyCaret
                              • Model Ensemble
                              • Model Evaluation
                              • Model Evaluation vs Model Optimisation
                              • Model Interpretability
                              • Model Observability
                              • Model Optimisation
                              • Model Parameters
                              • Model Parameters Tuning
                              • Model parameters vs hyperparameters
                              • Model Random States
                              • Model Selection
                              • Model Training
                              • Model Validation
                              • model-agnostic feature importance
                              • Momentum
                              • Moving Average Forecast
                              • Multinomial Naive bayes
                              • Multiple Linear Regression
                              • Naive Bayes Classifier
                              • Naive Forecast
                              • Neural network
                              • Neural Network Classification
                              • Neural network in Practice
                              • Neural Scaling Laws
                              • Non-negative matrix factorization in ML
                              • Non-parametric tests
                              • Normalisation of data
                              • Normalisation vs Standardisation
                              • objective function
                              • One-hot encoding
                              • Optimisation function
                              • Optimisation techniques
                              • Optimising a Logistic Regression Model
                              • Optimising Neural Networks
                              • Optuna
                              • Order matters in Boosting
                              • Ordinary Least Squares
                              • Orthogonalization
                              • Outliers
                              • Over parameterised models
                              • Partial Dependence Plot
                              • PCA Explained Variance Ratio
                              • PCA Principal Components
                              • PCA-Based Anomaly Detection
                              • Percentile Detection
                              • Performance Drift
                              • Polynomial Regression
                              • Positional Encoding
                              • Precision
                              • Precision or Recall
                              • Precision-Recall Curve
                              • Prediction Intervals vs Confidence Interval
                              • Principal Component Analysis
                              • PyCaret
                              • PyOD
                              • PyTorch
                              • Pytorch vs Tensorflow
                              • Q-Learning
                              • Random Forest
                              • Random Forest for Time Series
                              • Random Forest Interpretability
                              • Recall
                              • Recommender systems
                              • Recurrent Neural Networks
                              • Regression
                              • Regression Metrics
                              • Regularisation
                              • Regularisation of Tree based models
                              • Reinforcement learning
                              • Relationships in memory
                              • Reward Function
                              • Ridge
                              • ROC (Receiver Operating Characteristic)
                              • Sammon’s Mapping
                              • SARIMA
                              • Scikit-Learn
                              • Secretary Problem
                              • semi-structured data
                              • Sentence Transformers
                              • Sklearn Pipeline
                              • Specificity
                              • Spectral Clustering
                              • Supervised Learning
                              • Support Vector Classifier
                              • Support Vector Machines
                              • Support Vector Regression
                              • Symbolic Regression
                              • Tensorflow
                              • Test Loss When Evaluating Models
                              • Text Classification
                              • Time Series Python Packages
                              • Train-Dev-Test Sets
                              • Transfer Learning
                              • Transformed Target Regressor
                              • Transformer
                              • Transformers vs RNNs
                              • Type I Error (False Positive)
                              • Type II Error (False Negative)
                              • Types of Neural Networks
                              • Typical Output Formats in Neural Networks
                              • UMAP
                              • Unsupervised Learning
                              • Use Cases for a Simple Neural Network Like
                              • vanishing and exploding gradients problem
                              • Variability in linear models
                              • Variance in ML
                              • Vector Embedding
                              • WCSS and elbow method
                              • Weak Learners
                              • When and why not to use regularisation
                              • Why does increasing the number of models in an ensemble not necessarily improve the accuracy
                              • Why does the Adam Optimizer converge
                              • Why Removing Outliers May Improve Regression but Harm Classification
                              • Why standardise features
                              • Why Type 1 and Type 2 matter
                              • Wrapper Methods
                              • Xavier
                              • XGBoost
                            • natural-language
                              • AI Agents Memory
                              • Attention mechanism
                              • Bag of words
                              • BERT
                              • BERTScore
                              • Chain of thought
                              • ChatGPT
                              • Claude
                              • Comparing LLMs
                              • Distillation
                              • ElasticSearch
                              • Embedded Methods
                              • embeddings for OOV words
                              • Evaluate Embedding Methods
                              • Fuzzywuzzy
                              • Generative AI
                              • Generative AI From Theory to Practice
                              • Grammar method
                              • Guardrails
                              • How businesses use Gen AI
                              • How LLMs store facts
                              • How to reduce the need for Gen AI responses
                              • How would you decide between using TF-IDF and Word2Vec for text vectorization
                              • In NER how would you handle ambiguous entities
                              • Key Components of Attention and Formula
                              • Knowledge graph vs RAG setup
                              • Language Model Output Optimisation
                              • Language Models
                              • Language Models Large (LLMs) vs Small (SLMs)
                              • lemmatization
                              • LLM
                              • LLM Memory
                              • Local LLM use cases
                              • Mathematical Reasoning in Transformers
                              • Mixture of Experts
                              • Model Cascading
                              • Multi-head attention
                              • Named Entity Recognition
                              • NER Implementation
                              • Ngrams
                              • NLP
                              • NLP Portal
                              • nltk
                              • Non-negative Matrix Factorization
                              • NotebookLM
                              • OOV words
                              • Pandas Dataframe Agent
                              • Part of speech tagging
                              • Prompt Engineering
                              • prompt retrievers
                              • Prompts
                              • Pyright
                              • RAG
                              • Scaling Agentic Systems
                              • Self attention vs multi-head attention
                              • Self-Attention
                              • Semantic Relationships
                              • Semantic search
                              • Sentence Similarity
                              • Sentence Transformer Workflow
                              • Similarity Search
                              • Small Language Models
                              • spaCy
                              • Stemming
                              • stopwords
                              • Summarisation
                              • syntactic relationships
                              • Text2Cypher
                              • TF-IDF
                              • TF-IDF Implementation
                              • Tokenisation
                              • topic modeling
                              • Vectorisation
                              • Why is named entity recognition (NER) a challenging task
                              • Word2vec
                              • WordNet
                            • OTHER
                              • Addressing_Multicollinearity.py
                              • algebraic chess notation
                              • Bag_of_Words.py
                              • Bandit example output
                              • Bandit_Example_Fixed.py
                              • Click_Implementation.py
                              • Comparing_Ensembles.py
                              • Cross_Entropy_Single.py
                              • Cross_Entropy.py
                              • Debugging.py
                              • Distribution_Analysis.py
                              • Factor_Analysis.py
                              • FastAPI_Example.py
                              • Forecasting_AutoArima.py
                              • Forecasting_Baseline.py
                              • Forecasting_Exponential_Smoothing.py
                              • Gaussian_Mixture_Model_Implementation.py
                              • Handling_Missing_Data_Basic.ipynb
                              • Handling_Missing_Data.ipynb
                              • Imbalanced_Datasets_SMOTE.py
                              • K_Means.py
                              • Momentum.py
                              • One_hot_encoding.py
                              • Pandas_Common.py
                              • Pandas_Stack.py
                              • PCA_Analysis.ipynb
                              • PCA_Based_Anomaly_Detection.py
                              • PGN
                              • Pycaret_Anomaly.ipynb
                              • Pycaret_Example.py
                              • Pydantic_More.py
                              • Pydantic.py
                              • Regression_Logistic_Metrics.ipynb
                              • ROC_Curve.py
                              • SVM_Example.py
                              • Testing_Pytest.py
                              • Testing_unittest.py
                              • transfer_learning.py
                              • TS_Anomaly_Detection.py
                              • Vector_Embedding.py
                              • Wikipedia_API.py
                              • Word2Vec.py
                            • PAPER
                              • Attention Is All You Need
                              • BERT Pretraining of Deep Bidirectional Transformers for Language Understanding
                            • project-management
                              • 1-on-1 Template
                              • 1-to-1's with a Line Manager
                              • Asking questions
                              • Being a Facilitator
                              • Change Management
                              • Communication principles
                              • Communication Techniques
                              • Communication with Stakeholders
                              • Communications
                              • Conceptual Model
                              • Data Storytelling
                              • Documentation
                              • Education and Training
                              • Experiment Plan Template
                              • Feedback Template
                              • Fishbone diagram
                              • How to do git commit messages properly
                              • html
                              • Innovation
                              • Jobs to be done
                              • Jupyter Book
                              • Locus of Control
                              • Managing Data Science Teams
                              • Minto Pyramid Principle
                              • Modern data team
                              • nbconvert slideshows
                              • One Pager Template
                              • pdoc
                              • Problem Definition
                              • Process for prototyping
                              • project management
                              • Project Management Portal
                              • Pull Request Template
                              • RACI
                              • Remaining useful life models
                              • Return of Experience Form
                              • Reveal.js
                              • STAR Job Interview Method
                              • Technical Debt
                              • Tell me about yourself question
                              • UML
                              • Why use ER diagrams
                            • statistics
                              • Addressing Multicollinearity
                              • ANOVA
                              • Assumption of Normality
                              • Bernoulli
                              • Bootstrap Sampling
                              • Causal Inference
                              • Central Limit Theorem
                              • Central Limit Theorem & Small Sample Sizes
                              • Chi-Squared Test
                              • Confidence Interval
                              • Correlation
                              • Correlation vs Causation
                              • Cosine Similarity
                              • Covariance
                              • Covariance vs Correlation
                              • Cryptography
                              • Differentiation
                              • Distributions
                              • dta
                              • EM Algorithm
                              • Factor Analysis
                              • Gaussian Distribution
                              • Graph Theory
                              • Grouped plots
                              • Handling Different Distributions
                              • Hypothesis testing
                              • information theory
                              • Interquartile Range (IQR) Detection
                              • Johnson–Lindenstrauss lemma
                              • Markov chain
                              • Mathematics
                              • Mean Absolute Error
                              • Mean Squared Error
                              • mean vs median
                              • Model Understanding
                              • Multicollinearity
                              • non-parametric
                              • Odds
                              • Odds vs Probability
                              • p values
                              • Parametric tests
                              • parametric vs non-parametric models
                              • parametric vs non-parametric tests
                              • parsimonious
                              • Prediction Intervals
                              • Probability
                              • Proportion Test
                              • Q-Q Plot
                              • R
                              • R squared
                              • R-squared metric not always a good indicator of model performance in regression
                              • Reasoning tokens
                              • Resampling
                              • Root Mean Squared Error
                              • Spearman vs Pearson Correlation
                              • Standard deviation
                              • Standardisation
                              • Statistical Assumptions
                              • Statistical Modeling
                              • Statistical Tests
                              • Statistical theorems
                              • Statistics
                              • statsmodels
                              • Stochastic Gradient Descent
                              • Stochastic Modeling
                              • Symbolic computation
                              • Sympy
                              • T-test
                              • univariate vs multivariate
                              • Variance
                              • Violin plot
                              • Z-Normalisation
                              • Z-Score
                              • Z-Scores vs Prediction Intervals
                              • Z-Test
                            • uncategorised
                              • Balancing Mechanism Implications
                              • Balancing Mechanism Operation
                              • Balancing Mechanism Units
                              • Demand Response Baselines
                              • Demand Response Economics
                              • Demand Response in Markets
                              • Demand Response Operation
                              • Demand Side Flexibility
                              • Demand Side Response
                              • Distribution Network Operators
                              • Elexon
                              • Elexon and Settlement
                              • Energy Sector Portal
                              • Energy Trading
                              • National Grid ESO
                              • NIV-chasing
                              • statistical tests
                              • think_stats
                              • Untitled
                              • Virtual Power Plant
                          • copilot
                            • copilot-conversations
                              • give_me_an_example@20260403_221021
                              • who_is_a_supplier@20260403_220923
                            • copilot-custom-prompts
                              • Clip Web Page
                              • Clip YouTube Transcript
                              • Emojify
                              • Explain like I am 5
                              • Fix grammar and spelling
                              • Generate glossary
                              • Generate table of contents
                              • Make longer
                              • Make shorter
                              • Remove URLs
                              • Rewrite as tweet
                              • Rewrite as tweet thread
                              • Simplify
                              • Summarize
                              • Translate to Chinese
                            • pages
                              • Data Archive
                              • DE_Tools
                              • ML_Tools
                              • Quotes
                              • Research Questions
                              • Reviews

