Achievements
🏆
2025 · Competition
Finalist — Data Mining Category (Gemastik)
Kemendiktisaintek · National
Reached the finals of the Data Mining category at Gemastik XVIII 2025, Indonesia's largest national student technology competition, organized by Kemendiktisaintek. We built an agentic Text-to-SQL system targeting a real-world problem: most Indonesian civil servants (ASN) lack SQL skills, so data-driven policymaking is out of reach without depending on technical staff.
The problem
- Only ~30% of Indonesian civil servants (ASN) can work digitally, and almost none can write SQL — creating an information bottleneck for government data access.
- Policy makers and leaders are forced to rely on a small pool of technical staff just to query databases.
- Goal: build a bridge between natural language and SQL so non-technical users can interact with government databases directly.
What we built
- An agentic Text-to-SQL system that translates natural language questions (English) into executable SQL queries using a Small Language Model (SLM).
- Model: Qwen2.5-Coder-32B-Instruct (32.5B parameters), chosen over large commercial LLMs for its coding capability and resource efficiency.
- Architecture: three-area workflow — User Area (input handling), Management Area (schema validation, prompt construction, agent scratchpad), and Generation Area (SLM + tool binding).
- The agent loop uses the Model Context Protocol (MCP) to bind three tools (sql_db_list_tables, sql_db_schema, sql_db_query), enabling iterative query generation, execution, and self-correction.
- A Thinking Pool stores thought/action/observation traces per iteration, concatenated into the prompt for multi-step reasoning.
- Evaluated on 198 questions from a representative subset of the Spider dataset (6 databases, 69 schemas).
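The agent loop and Thinking Pool described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the three tool names come from the list above, but their bodies are plain Python functions over an in-memory SQLite database rather than MCP tools, the `apartments` table is a made-up example, and `scripted_model` is a hypothetical stub standing in for the SLM so that the loop can run without a model.

```python
import sqlite3


def make_demo_db():
    """Tiny in-memory database standing in for a real schema (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE apartments (id INTEGER PRIMARY KEY, bedrooms INTEGER)")
    conn.executemany("INSERT INTO apartments (bedrooms) VALUES (?)", [(1,), (2,), (3,)])
    return conn


# Tool names follow the text; the bodies are illustrative stand-ins, not MCP tools.
def sql_db_list_tables(conn, _arg=""):
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
    return ", ".join(r[0] for r in rows)


def sql_db_schema(conn, table):
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type='table' AND name=?", (table,)
    ).fetchone()
    return row[0] if row else f"Error: unknown table {table}"


def sql_db_query(conn, sql):
    try:
        return str(conn.execute(sql).fetchall())
    except sqlite3.Error as exc:
        # Errors are returned as observations so the next iteration can self-correct.
        return f"Error: {exc}"


TOOLS = {
    "sql_db_list_tables": sql_db_list_tables,
    "sql_db_schema": sql_db_schema,
    "sql_db_query": sql_db_query,
}


def render_prompt(question, thinking_pool):
    """Concatenate the stored thought/action/observation traces into the prompt."""
    lines = [f"Question: {question}"]
    for step in thinking_pool:
        lines.append(f"Thought: {step['thought']}")
        lines.append(f"Action: {step['action']}[{step['action_input']}]")
        lines.append(f"Observation: {step['observation']}")
    return "\n".join(lines)


def scripted_model(prompt):
    """Hypothetical stub replacing the SLM: picks the next step from the prompt so far."""
    if "Observation:" not in prompt:
        return ("List the tables first.", "sql_db_list_tables", "")
    if "CREATE TABLE" not in prompt:
        return ("Inspect the schema.", "sql_db_schema", "apartments")
    if "no such column" not in prompt:
        # Deliberately wrong first attempt, to exercise the self-correction path.
        return ("Count the apartments.", "sql_db_query", "SELECT COUNT(room) FROM apartments")
    return ("Fix the column name.", "sql_db_query", "SELECT COUNT(*) FROM apartments")


def run_agent(question, conn, model=scripted_model, max_iters=6):
    thinking_pool = []
    for _ in range(max_iters):
        thought, action, action_input = model(render_prompt(question, thinking_pool))
        observation = TOOLS[action](conn, action_input)
        thinking_pool.append(
            {"thought": thought, "action": action,
             "action_input": action_input, "observation": observation}
        )
        # Stop once a query executes without error.
        if action == "sql_db_query" and not observation.startswith("Error"):
            return action_input, observation, thinking_pool
    return None, None, thinking_pool


final_sql, result, pool = run_agent("How many apartments are there?", make_demo_db())
```

In this sketch the first query fails, the error text lands in the Thinking Pool as an observation, and the next prompt carries that trace so the following step can issue a corrected query.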
Results
- 100% execution success rate — all 198 questions produced runnable SQL queries.
- Hybrid Similarity: 7.75/10 — combining token similarity (7.37), AST structural similarity (7.89), and embedding-level semantic similarity (9.67).
- SLM-based Semantic Equivalence: 8.19/10 — indicating the system reliably captures the intent behind questions.
- Best-performing database: apartment_rentals (hybrid 8.28, semantic 9.24), which has a simpler, linear schema.
- Hardest database: formula_1 (hybrid 7.09, semantic 6.21) — 13 tables with complex seasonal relations (lap_times, pit_stops, constructors).
- Key failure pattern: semantic mismatch on relative or ambiguous terms (e.g., "popular" or "payment methods"). The model hallucinates table and column choices when the natural-language wording does not map exactly to schema names.
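To make the Hybrid Similarity metric above more concrete, here is a rough sketch of combining three component scores into one 0 to 10 value. Everything in it is an assumption for illustration: the source does not state the weights (equal weights over the reported component scores would give about 8.31 rather than 7.75, so the original weighting evidently differs), and the component functions are simplified stand-ins. Token similarity is a Jaccard overlap, the "structural" score compares clause-keyword sequences instead of parsing a real AST, and the "semantic" score is a cosine over character trigrams instead of learned embeddings.

```python
import re
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt

# Clause keywords used as a crude proxy for query structure (assumption, not a real AST).
CLAUSES = {"select", "from", "join", "where", "group", "having", "order", "limit"}


def tokens(sql):
    """Lowercased identifiers, numbers, and punctuation symbols."""
    return re.findall(r"[a-z_][a-z0-9_]*|\d+|[^\s\w]", sql.lower())


def token_similarity(a, b):
    """Jaccard overlap between the two token sets."""
    ta, tb = set(tokens(a)), set(tokens(b))
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0


def structural_similarity(a, b):
    """Stand-in for AST similarity: compare the order of clause keywords."""
    sa = [t for t in tokens(a) if t in CLAUSES]
    sb = [t for t in tokens(b) if t in CLAUSES]
    return SequenceMatcher(None, sa, sb).ratio()


def trigram_cosine(a, b):
    """Stand-in for embedding similarity: cosine over character trigram counts."""
    def grams(s):
        s = " ".join(s.lower().split())
        return Counter(s[i:i + 3] for i in range(len(s) - 2))

    ca, cb = grams(a), grams(b)
    dot = sum(ca[g] * cb[g] for g in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def hybrid_similarity(pred, gold, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three components, scaled to 0-10. Weights are placeholders."""
    parts = (
        token_similarity(pred, gold),
        structural_similarity(pred, gold),
        trigram_cosine(pred, gold),
    )
    return 10 * sum(w * p for w, p in zip(weights, parts))


same = hybrid_similarity("SELECT name FROM t", "SELECT name FROM t")
different = hybrid_similarity("SELECT name FROM t", "SELECT COUNT(*) FROM t WHERE x > 1")
```

A production version would presumably swap in a real SQL parser for the structural score and an embedding model for the semantic score; the weighted combination step would stay the same.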
Research paper
- Paper title: "Sistem Tanya-Jawab Agentis Berbasis SQL Menggunakan Small Language Model".
- Contribution: demonstrates that a lightweight SLM-based agentic approach can match LLM-scale Text-to-SQL performance at a fraction of the computational cost — suitable for deployment in resource-constrained government environments.
Links