Q5: How does the ReLU activation function help overcome the vanishing gradient problem?
A5: The rectified linear unit (ReLU) activation function helps overcome the vanishing gradient problem because it does not saturate for positive inputs. Saturating activations such as sigmoid and tanh have derivatives smaller than 1 (at most 0.25 for sigmoid), so gradients shrink exponentially as they are multiplied backward through many layers. ReLU outputs its input unchanged when it is positive and zero otherwise, so its derivative is exactly 1 for active units, and the backward-flowing gradient passes through them without shrinking. This promotes better gradient flow and enables effective learning in deep neural networks.
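To make the contrast concrete, below is a minimal NumPy sketch (an illustration added here, not part of the original answer) that multiplies the local derivatives of 20 stacked activations, ignoring weights for simplicity: the sigmoid chain collapses toward zero, while the ReLU chain stays at 1 for a positive input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # at most 0.25, so repeated products shrink

def relu_grad(x):
    return (x > 0).astype(float)  # exactly 1 for positive inputs, 0 otherwise

x = 0.5        # a positive pre-activation value (chosen for illustration)
layers = 20

# Product of local derivatives across the layers (weights ignored for clarity).
sigmoid_chain = np.prod([sigmoid_grad(x) for _ in range(layers)])
relu_chain = np.prod([relu_grad(x) for _ in range(layers)])

print(f"sigmoid gradient factor after {layers} layers: {sigmoid_chain:.2e}")  # ~1e-13, vanishes
print(f"relu gradient factor after {layers} layers:    {relu_chain:.2e}")     # stays at 1.0
```

Note that ReLU's derivative is zero for negative inputs, so units that stay negative pass no gradient at all (the "dying ReLU" issue); the benefit described above applies to the active units.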