Tim Lukas Adam
Final-year Software Engineering student with a focus on applied artificial intelligence, large language models, and data-driven systems. Experienced in developing and evaluating AI solutions through industry-linked projects and research, including local LLM deployment, fine-tuning, benchmarking, and multi-agent systems.
I am a final-year BSc Software Engineering student at the University of Southern Denmark, currently writing my bachelor’s thesis in collaboration with Danfoss. Over the course of my studies, I have become increasingly focused on applied AI, especially large language models, agent-based systems, and the use of intelligent systems in real-world domains.
An exchange semester at HKUST helped sharpen that direction through coursework in machine learning, large language models, and data analysis. Next, I am looking to continue with a master’s degree in AI or Applied AI, with a strong interest in civil engineering, infrastructure, and data-driven systems.
Interests
LLM scaling laws and limitations
Understanding where model capability grows, saturates, or breaks.
Agentic systems
Tool-using systems with observable decisions.
Small language models
Efficient models for local and practical AI.
Applied AI systems
AI for infrastructure and data-driven systems.
Research
CAKE: Cloud Architecture Knowledge Evaluation of Large Language Models
Large language models (LLMs) are increasingly used as software architecture co-pilots. However, no benchmark currently exists to evaluate how well LLMs actually understand cloud-native software architecture. We therefore present CAKE, a benchmark of 188 expert-validated questions covering four cognitive levels of Bloom's revised taxonomy (recall, analyze, design, and implement) and five cloud-native topics. We evaluate 22 model configurations (0.5B to 70B parameters) across four LLM families, using three-run majority voting for multiple-choice questions (MCQs) and LLM-as-a-judge scoring for free-response (FR) questions. The evaluation yields four notable findings. First, MCQ accuracy plateaus above 3B parameters, with the best model reaching 99.2%. Second, free-response scores scale steadily across all cognitive levels. Third, the two formats capture different facets of knowledge: MCQ accuracy approaches a ceiling while free-response scores continue to differentiate models. Finally, reasoning augmentation (+think) improves free-response quality, while tool augmentation (+tool) degrades performance for small models. These results suggest that the evaluation format fundamentally shapes how we measure architectural knowledge in LLMs.
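As a rough illustration of the three-run majority voting mentioned in the abstract, the sketch below picks the option that a strict majority of runs agree on. The function name `majority_vote` and the choice to return `None` when no option wins a strict majority are assumptions made for this example; the abstract does not say how CAKE handles three-way disagreement.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(answers: Sequence[str]) -> Optional[str]:
    """Return the answer chosen by a strict majority of runs, else None.

    Hypothetical helper: returning None on disagreement is an assumption,
    not something the CAKE abstract specifies.
    """
    if not answers:
        return None
    top_answer, top_count = Counter(answers).most_common(1)[0]
    # A strict majority means more than half of the runs agree.
    if top_count > len(answers) / 2:
        return top_answer
    return None


# Three runs of the same multiple-choice question (made-up answers).
agreed = majority_vote(["B", "B", "C"])
split = majority_vote(["A", "B", "C"])
```

With three runs, two matching answers are enough to settle the question; a three-way split is left unresolved here rather than broken arbitrarily.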
Architecture Without Architects: How AI Coding Agents Shape Software Architecture
AI coding agents select frameworks, scaffold infrastructure, and wire integrations, often in seconds. These are architectural decisions, yet almost no one reviews them as such. We identify five mechanisms by which agents make implicit architectural choices and propose six prompt-architecture coupling patterns that map natural-language prompt features to the infrastructure they require. The patterns range from contingent couplings (structured output validation) that may weaken as models improve to fundamental ones (tool-call orchestration) that persist regardless of model capability. An illustrative demonstration confirms that prompt wording alone produces structurally different systems for the same task. We term the phenomenon vibe architecting, architecture shaped by prompts rather than deliberate design, and outline review practices, decision records, and tooling to bring these hidden decisions under governance.
A Reference Architecture for Agentic Hybrid Retrieval in Dataset Search
Ad hoc dataset search requires matching underspecified natural-language queries against sparse, heterogeneous metadata records, a task where typical lexical or dense retrieval alone falls short. We reposition dataset search as a software-architecture problem and propose a bounded, auditable reference architecture for agentic hybrid retrieval that combines BM25 lexical search with dense-embedding retrieval via reciprocal rank fusion, orchestrated by a large language model agent that repeatedly plans queries, evaluates the sufficiency of results, and reranks candidates. To reduce vocabulary mismatch, we introduce an offline metadata augmentation step in which an LLM generates pseudo-queries for each dataset record, augmenting both retrieval indexes before query time. Two architectural styles are examined: a single ReAct agent and a multi-agent horizontal architecture with Feedback Control. Their quality-attribute tradeoffs are analyzed with respect to modifiability, observability, performance, and governance.
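The reciprocal rank fusion step named in the abstract can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: the function name, the dataset ids, and the smoothing constant `k = 60` (a commonly used default) are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of ids into a single ranking.

    Each id scores the sum of 1 / (k + rank) over the lists it appears in,
    with ranks starting at 1. k = 60 is an assumed default.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a lexical (BM25) and a dense retriever.
bm25_ranking = ["ds_a", "ds_b", "ds_c"]
dense_ranking = ["ds_b", "ds_d", "ds_a"]
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

Because the fusion uses only rank positions, the BM25 and embedding scores never need to be normalized onto a common scale, which is one reason the technique suits hybrid retrieval.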
Ongoing Projects
The Trader's Trinity: Forecasting Models, RL Agents, and LLM Judges for Day-Ahead Markets
Investigating how forecasting models, reinforcement learning agents, and LLM-based judges can predict and explain behavior in Danish energy markets.
Education
Semester Abroad
Hong Kong University of Science and Technology (HKUST)
Selected courses focused on machine learning, language models, and applied data analysis:
- COMP4211 Machine Learning
- COMP4901B Large Language Models
- CIVL4610 Data Analysis for Smart Transportation Systems
Bachelor of Software Engineering
University of Southern Denmark, Sonderborg
Project-oriented and applied program with semester projects often developed with companies around real cases and data.
German Abitur — High School Diploma
Gymnasium Kaiser-Friedrich-Ufer, Hamburg