Achievements
2024 · Competition
5th Place Final — Objective Quest Airnology 3.0
BEM FTTM Airlangga University · National
Placed 5th in the finals of Objective Quest Airnology 3.0, hosted by BEM FTTM Airlangga University — a national data science competition with 88 competing teams. The task: multiclass classification of web network traffic into attack categories.
The problem
- Classify 416,473 network flows (43 features) into 6 traffic categories: Background, Benign, Bruteforce, Bruteforce-XML, Probing, and XMRIGCC CryptoMiner.
- Heavily imbalanced dataset — Background and Benign classes dominated, while attack classes (Bruteforce, CryptoMiner) were rare minorities.
- Many features had up to 35% missing values; several were algebraically dependent (e.g., flow_packets_per_sec = forward packets/sec + backward packets/sec).
- Many flag features (SYN, FIN, RST, PSH, etc.) were sparse — over 80% zero values — leaving them with near-zero variance and little usable spread (a quick inspection sketch follows this list).
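These properties are easy to verify up front. A minimal inspection sketch, assuming the flows sit in a train.csv with a label column and flag columns named as below (all file and column names are hypothetical, not the competition's actual layout):

```python
import pandas as pd

# Hypothetical file and column names; the real competition layout may differ.
df = pd.read_csv("train.csv")

# Class balance: Background/Benign dominate, attack classes are rare.
print(df["label"].value_counts(normalize=True))

# Missingness per feature (several columns reach ~35%).
print(df.isna().mean().sort_values(ascending=False).head(10))

# Sparsity of flag features: fraction of zero values per column.
flag_cols = ["syn_flag_count", "fin_flag_count", "rst_flag_count", "psh_flag_count"]
print((df[flag_cols] == 0).mean())
```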
EDA and preprocessing
- Identified multicollinearity clusters via a correlation matrix and dropped redundant features so the model received compact, non-overlapping information (see the preprocessing sketch after this list).
- Imputed missing values algebraically for dependent features (flow_duration = active + idle; down_up_ratio = backward subflow packets / forward subflow packets).
- Binarized sparse flag features (value > 0 → 1) to keep the presence/absence signal while discarding noisy near-zero magnitudes.
- Created combinatorial categorical features from origin_host, origin_port, response_host, response_port (0% missing) — e.g., origin_host + response_host — to encode directional network relationships.
- Applied backward-forward (stepwise) feature elimination, dropping 28 features to leave a denser, more informative feature set (a simplified sketch follows below).
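A minimal pandas sketch of the first four steps. The column names, the 0.95 correlation cutoff, and the fill details are assumptions layered on the write-up, not the exact competition pipeline:

```python
import numpy as np
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()

    # Algebraic imputation for dependent features (assumed column names).
    df["flow_duration"] = df["flow_duration"].fillna(df["active"] + df["idle"])
    df["down_up_ratio"] = df["down_up_ratio"].fillna(
        df["bwd_subflow_packets"] / df["fwd_subflow_packets"].replace(0, np.nan)
    )

    # Binarize sparse flag counts: presence carries the signal, not magnitude.
    flag_cols = [c for c in df.columns if c.endswith("_flag_count")]
    df[flag_cols] = (df[flag_cols] > 0).astype(int)

    # Combinatorial categoricals encoding directional network relationships.
    df["host_origin_response"] = (
        df["origin_host"].astype(str) + "_" + df["response_host"].astype(str)
    )
    df["port_origin_response"] = (
        df["origin_port"].astype(str) + "_" + df["response_port"].astype(str)
    )

    # Drop one feature from every highly correlated pair (|r| > 0.95 assumed).
    corr = df.select_dtypes(include=np.number).corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > 0.95).any()]
    return df.drop(columns=to_drop)
```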
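The elimination step can be approximated as a greedy backward pass over cross-validated scores; the full stepwise procedure also re-tries dropped features (the forward half), which this sketch omits. Metric and fold count are assumptions, the loop assumes a fully numeric X, and it is expensive (one CV run per candidate drop):

```python
from catboost import CatBoostClassifier
from sklearn.model_selection import cross_val_score

def backward_eliminate(X, y, min_features=10):
    """Greedy backward elimination: drop a feature whenever the CV score holds up."""
    def cv_score(cols):
        model = CatBoostClassifier(iterations=200, verbose=False, random_seed=42)
        return cross_val_score(model, X[cols], y, cv=3, scoring="f1_macro").mean()

    feats = list(X.columns)
    best = cv_score(feats)
    improved = True
    while improved and len(feats) > min_features:
        improved = False
        for f in list(feats):
            trial = [c for c in feats if c != f]
            score = cv_score(trial)
            if score >= best:  # dropping f does not hurt, so keep the drop
                best, feats = score, trial
                improved = True
                break
    return feats
```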
Modeling and experiments
- Baseline: CatBoost, XGBoost, LightGBM — CatBoost was best at 0.858 ± 0.01 and became the primary model for all experiments (comparison sketch below).
- Undersampling: manually reduced Background/Benign to 50–90% of their original size. Best local score 0.893 — but it overfit severely on Kaggle, with the private score dropping significantly (undersampling sketch below).
- Categorical combination: added concatenated host+port features — scored 0.850 ± 0.002 locally with a far tighter spread than the baseline (±0.002 vs ±0.01), and became the base setup for subsequent experiments.
- Balanced Class Weight: assigned inverse-frequency weights to penalize misclassification of the minority attack classes more heavily. Score: 0.855 ± 0.002.
- Tuned Class Weight: manually adjusted weights — reduced majority-class penalties and boosted minority classes (Bruteforce ×13.7, CryptoMiner ×19.2). Best result: 0.864 ± 0.002 locally, 0.88271 private Kaggle score (weighting sketch below).
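The baseline comparison is a plain cross-validation loop. A sketch assuming macro-F1 as the local metric (the write-up does not name it) and integer-encoded labels, which XGBoost requires for multiclass targets:

```python
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

def compare_baselines(X, y):
    """5-fold CV over the three boosted-tree baselines; y must be integer-encoded."""
    models = {
        "CatBoost": CatBoostClassifier(verbose=False, random_seed=42),
        "XGBoost": XGBClassifier(random_state=42),
        "LightGBM": LGBMClassifier(random_state=42),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="f1_macro")
        print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```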
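Majority-class undersampling of the kind tried here takes a few lines of pandas; the per-class fractions below are hypothetical points in the 50–90% range from the experiments:

```python
import pandas as pd

def undersample_majority(df, label_col="label", fracs=None, seed=42):
    """Keep a fraction of each majority class; leave minority classes intact."""
    fracs = fracs or {"Background": 0.6, "Benign": 0.6}  # hypothetical fractions
    parts = [
        grp.sample(frac=fracs.get(cls, 1.0), random_state=seed)
        for cls, grp in df.groupby(label_col)
    ]
    return pd.concat(parts).sample(frac=1.0, random_state=seed)  # reshuffle rows
```

As the results below note, the local gains from this approach did not survive contact with the private leaderboard.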
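Class weighting plugs straight into CatBoost through its class_weights parameter. A sketch of both variants; only the Bruteforce and CryptoMiner multipliers come from the write-up, the remaining tuned values are illustrative:

```python
import numpy as np
from catboost import CatBoostClassifier

def inverse_freq_weights(y):
    """Balanced variant: weight each class by n_samples / (n_classes * count)."""
    classes, counts = np.unique(y, return_counts=True)
    return dict(zip(classes, len(y) / (len(classes) * counts)))

# Tuned variant: soften majority penalties, boost the rare attack classes.
TUNED_WEIGHTS = {
    "Background": 1.0,            # illustrative
    "Benign": 1.0,                # illustrative
    "Bruteforce": 13.7,           # from the write-up
    "Bruteforce-XML": 5.0,        # illustrative
    "Probing": 3.0,               # illustrative
    "XMRIGCC CryptoMiner": 19.2,  # from the write-up
}

def train_weighted(X, y, cat_cols, weights=TUNED_WEIGHTS):
    model = CatBoostClassifier(
        class_weights=weights, iterations=1000, verbose=False, random_seed=42
    )
    model.fit(X, y, cat_features=cat_cols)
    return model
```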
Results
- Finished 5th in the finals and 3rd in the preliminaries among 88 teams, competing as team LabtekV.
- Best model: CatBoost with Tuned Class Weight — public score 0.88118, private score 0.88271.
- Most important feature: host_origin_response (concatenated categorical) — directional network relationship was the strongest predictor of traffic type.
- Key finding: undersampling looked great locally but overfit badly; class weighting generalized far better to unseen data.