Fast, interpretable baselines for text classification
Converts text into numerical vectors by weighting words based on frequency and uniqueness. Common words (e.g., "the", "is") get low weights, while distinctive words get high weights.
Formula: TF-IDF(t,d) = TF(t,d) × IDF(t) = (term freq in doc) × log(total docs / docs with term)
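The formula above can be computed by hand in a few lines. This is a minimal sketch using raw counts and the natural log; library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization, so their numbers differ slightly:

```python
import math

# Toy corpus: three tokenized documents.
docs = [["the", "cat", "sat"],
        ["the", "dog", "sat"],
        ["the", "cat", "ran"]]

def tf_idf(term, doc, docs):
    tf = doc.count(term)               # term frequency in this doc
    df = sum(term in d for d in docs)  # number of docs containing the term
    return tf * math.log(len(docs) / df)

# "the" appears in every doc -> IDF = log(3/3) = 0, so its weight is 0.
print(tf_idf("the", docs[0], docs))
# "cat" appears in 2 of 3 docs -> weight = log(3/2), a distinctive word.
print(tf_idf("cat", docs[0], docs))
```

This makes the intuition concrete: ubiquitous words are zeroed out, while words concentrated in few documents keep a positive weight.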
Probabilistic classifier based on Bayes' theorem. Fast and efficient; works well even on small datasets.
Linear model with sigmoid activation. Simple, interpretable, good baseline.
Finds the optimal separating hyperplane between classes. Powerful for high-dimensional data.
Ensemble of decision trees. Robust, handles non-linearity, reduces overfitting.
Multi-layer perceptron with hidden layers. Can learn complex patterns.
Gradient boosting algorithm. Often wins ML competitions. High performance.
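The six classifiers above all share the same fit/predict workflow on TF-IDF features. A minimal sketch with scikit-learn, using a tiny invented corpus; `GradientBoostingClassifier` stands in for XGBoost here so the example needs no extra dependency (the actual notebook may use the `xgboost` package):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier

# Toy data for illustration only (1 = positive, 0 = negative).
texts = ["great movie", "terrible film", "loved it", "hated it",
         "wonderful acting", "awful plot", "great acting", "awful movie"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

# TF-IDF feature extraction, then the same API for every model.
X = TfidfVectorizer().fit_transform(texts)

classifiers = {
    "Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(),
    "SVM": LinearSVC(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Neural Network": MLPClassifier(max_iter=500, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

for name, clf in classifiers.items():
    clf.fit(X, labels)
    print(name, clf.score(X, labels))
```

In practice you would split into train/test sets before scoring; the point here is that swapping classifiers is a one-line change once features are extracted.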
Ready-to-run Python code with all 6 classifiers. Includes dataset download, TF-IDF extraction, training, and evaluation.
Open this notebook in Google Colab for interactive execution with free GPU access.
All dependencies pre-installed
Modify code and see results instantly
Download plots and models
Note: The notebook will open in a new tab. You may need to sign in with your Google account.