About Me

I build forecasting models, risk dashboards, and AI-assisted tools that turn raw data into decisions people can act on.

I am a Master's student in Data Science at the University of Luxembourg. I specialize in time-series forecasting and data analytics, build RAG-based AI assistants, and work end-to-end from feature engineering and model evaluation in Python and SQL to decision dashboards in Streamlit and Power BI.

Current Role: MSc Data Science Student
Current Focus: Data Analytics, ML, Gen AI, RAG, Forecasting
Location: Luxembourg, LU
Availability: Open to internships and applied data / AI roles

Skills

I work across the full data and AI workflow: understanding the problem, cleaning and validating data, building features, training and evaluating models, and turning results into dashboards, reports, or AI-assisted tools that people can actually use.

Programming & Data

Core languages & storage

Python, SQL, R, DuckDB, MongoDB, Neo4j

Machine Learning & Deep Learning

Predictive modelling

Scikit-learn, PyTorch, TensorFlow, XGBoost, LightGBM, NLP / Transformers

Statistics & Forecasting

Quantitative methods

Probability & Statistics, Feature Engineering / PCA, Backtesting, ARIMA / ARIMAX, Holt-Winters, LSTM / GRU

Generative AI & LLM Systems

RAG and AI assistants

RAG Pipelines, Embeddings / Vector Search, FAISS, LLM APIs / Inference, Prompt Engineering, LangChain

Data Analysis & Visualization

Exploration to insight

Pandas / NumPy, Matplotlib / Seaborn, Plotly / ggplot2, SciPy / Tidyverse, Data Storytelling

BI, Reporting & Deployment

Business-facing outputs

Power BI, Excel, Streamlit, Docker, Flask / FastAPI, Git / GitHub

Projects

Selected projects showing practical work in credit risk analytics, forecasting, business intelligence, and applied machine learning.

Credit Risk BI & Decisioning Dashboard

Analytics Dashboard | Nov 2025 – Feb 2026

Analytics & BI

Credit risk analytics system covering ingestion, data-quality contracts, DuckDB warehousing, PD modelling, SQL KPI marts, and Streamlit decision dashboards.

Streamlit, DuckDB, Docker

Objective

Create a practical credit risk analytics tool for underwriting, portfolio monitoring, probability-of-default scoring, and decision policy simulation.

Approach

Designed a workflow that covers ingestion, data-quality contracts, DuckDB warehousing, PD modelling, SQL KPI marts, and Streamlit dashboards for model and business reporting.

Results

Implemented model diagnostics including ROC/PR, calibration, Brier score, threshold analysis, and a simulator that converts PD into approve, review, and decline decisions under business constraints.
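As a rough illustration of the PD-to-decision step, here is a minimal sketch, not the project's actual code. The cut-offs (5% and 20%) and the function names are hypothetical, and a real policy would add business constraints such as a cap on the review queue.

```python
def brier_score(y_true, pd_scores):
    """Mean squared gap between predicted PD and the observed 0/1 default flag."""
    return sum((p - y) ** 2 for y, p in zip(y_true, pd_scores)) / len(y_true)


def decide(pd_score, approve_below=0.05, decline_above=0.20):
    """Map one probability of default to a decision (cut-offs are illustrative)."""
    if pd_score < approve_below:
        return "approve"
    if pd_score > decline_above:
        return "decline"
    return "review"


def simulate_policy(pd_scores, approve_below=0.05, decline_above=0.20):
    """Share of applicants that land in each decision bucket under the cut-offs."""
    counts = {"approve": 0, "review": 0, "decline": 0}
    for p in pd_scores:
        counts[decide(p, approve_below, decline_above)] += 1
    return {bucket: n / len(pd_scores) for bucket, n in counts.items()}
```

Re-running simulate_policy with different cut-offs shows how approval and review rates shift, which is the kind of trade-off a threshold analysis explores.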

Tools Used

Python, Streamlit, scikit-learn, DuckDB, SQL, Docker.

Stocks Next-Day Returns Prediction

Time-Series Forecasting | Oct 2025 – Mar 2026

ML Forecasting

End-to-end AAPL forecasting pipeline using 2013–2026 market data, engineered features, train-only PCA, and LSTM-to-GRU modelling.

Python, PyTorch, Optuna
80% variance retained, 285→91 features

Objective

Predict next-day AAPL log returns using daily market data while avoiding data leakage and preserving a realistic time-based modelling setup.

Approach

Built an end-to-end pipeline using yfinance data from 2013–2026, engineered endogenous, exogenous, calendar, lag, and rolling features, then applied train-only StandardScaler and PCA, reducing 285 features to 91 components while retaining 80% variance.
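The "train-only" idea can be sketched in plain Python: split chronologically, learn the scaling statistics on the training window alone, then reuse them on later data. This is a simplified stand-in for the StandardScaler step; the helper names and the split fraction are assumptions for illustration.

```python
from statistics import mean, pstdev


def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows without shuffling, so the test set is strictly later."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit_scaler(train_rows):
    """Learn per-column mean and population std from the training rows only."""
    columns = list(zip(*train_rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # constant column: divide by 1
    return means, stds


def transform(rows, means, stds):
    """Apply previously learned statistics to any rows (train, validation, or test)."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]
```

The same fit-on-train, apply-everywhere rule carries over to PCA. In scikit-learn, passing a fraction such as n_components=0.80 to PCA keeps the smallest number of components that explains at least that share of variance.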

Results

Benchmarked against Naive, ARIMA, ARIMAX, and XGBoost baselines, then trained an LSTM-to-GRU model with Optuna-based hyperparameter search and structured evaluation reporting.
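For context on the simplest benchmark, the sketch below computes log returns and scores two common naive forecasts. Which naive variant the project used is not stated here, so both the "zero return" and "repeat yesterday's return" versions are assumptions, as are the function names.

```python
import math


def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_baselines(returns):
    """MAE of two naive next-day forecasts over the same evaluation window."""
    actual = returns[1:]
    zero_forecast = [0.0] * len(actual)   # "tomorrow's return is zero"
    last_value_forecast = returns[:-1]    # "tomorrow repeats today"
    return {
        "zero": mae(actual, zero_forecast),
        "last_value": mae(actual, last_value_forecast),
    }
```

Any learned model has to beat these numbers on the same test window before its extra complexity is justified.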

Tools Used

Python, PyTorch, scikit-learn, yfinance, Optuna, PCA, LSTM, GRU.

NVDA News Sentiment Portfolio

Financial NLP | Oct 2025 – Dec 2025

NLP

Pipeline that collects and de-duplicates NVDA headlines, fine-tunes FinBERT on 300 labelled headlines, and aligns daily sentiment with returns for analysis.

FinBERT, NewsAPI, yfinance
300 labelled headlines

Objective

Analyze whether NVDA news sentiment can be aligned with next-day and next-3-day stock returns for downstream financial analysis.

Approach

Collected and de-duplicated headlines from NewsAPI and yfinance, fine-tuned FinBERT on a manually labelled 300-headline investor-viewpoint dataset, and aligned after-hours news to the next NASDAQ trading day.
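The after-hours alignment rule can be sketched as below. This is a simplified version under stated assumptions: timestamps are already in US/Eastern time, the regular close is 16:00, and exchange holidays are ignored (a production version would consult a trading calendar). The function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta

MARKET_CLOSE = time(16, 0)  # assumed regular NASDAQ close, US/Eastern


def next_weekday(day):
    """Return the first Monday-to-Friday date strictly after `day`."""
    day += timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return day


def effective_trading_day(published_et):
    """Trading day on which the market can first react to a headline."""
    day = published_et.date()
    if day.weekday() >= 5:
        return next_weekday(day)          # weekend news rolls to Monday
    if published_et.time() >= MARKET_CLOSE:
        return next_weekday(day)          # after-hours news rolls forward
    return day
```

For example, a headline published on a Friday evening is assigned to the following Monday, so its sentiment is never paired with a return that was realised before the news appeared.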

Results

Created daily sentiment features including mean sentiment and tail-negativity, then mapped them to next-day and next-3-day NVDA returns and abnormal returns for analysis.
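A minimal sketch of the daily aggregation follows. It assumes each headline already has a sentiment score in [-1, 1] and a trading day attached. The tail-negativity definition used here (share of headlines at or below a cut-off of -0.5) is an assumption for illustration and may differ from the project's definition.

```python
from collections import defaultdict
from statistics import mean


def daily_sentiment_features(scored_headlines, tail_cutoff=-0.5):
    """Aggregate (trading_day, score) pairs into one feature row per day."""
    by_day = defaultdict(list)
    for day, score in scored_headlines:
        by_day[day].append(score)

    features = {}
    for day, scores in sorted(by_day.items()):
        features[day] = {
            "mean_sentiment": mean(scores),
            # Assumed definition: fraction of strongly negative headlines.
            "tail_negativity": sum(s <= tail_cutoff for s in scores) / len(scores),
            "n_headlines": len(scores),
        }
    return features
```

Each daily row can then be joined to next-day and next-3-day returns on the trading-day key.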

Tools Used

Python, Hugging Face Transformers, FinBERT, NewsAPI, yfinance.

Walmart Weekly Sales Forecasting

Retail Forecasting | Dec 2025 – Feb 2026

Analytics & Forecasting

Retail forecasting workflow evaluated with MAE, RMSE, and WAPE across 840 test points, comparing department-level and aggregate planning approaches.

Pandas, NumPy, Matplotlib
840 test points evaluated

Objective

Forecast weekly Walmart sales and compare department-level forecasting against an aggregate-forecast-deaggregate strategy.

Approach

Cleaned and validated multi-year retail sales data, then implemented additive Holt-Winters forecasting from scratch with validation-based tuning across store and department time series.
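To show what "from scratch" involves, here is a compact sketch of additive Holt-Winters with a small grid search on a validation window. It is an illustrative version, not the project's implementation: the initialisation scheme, the grid values, and the function names are assumptions.

```python
from itertools import product


def holt_winters_additive(y, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: smooth level, trend, and seasonal terms, then forecast."""
    m = season_length
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons to initialise")
    # Simple initialisation from the first two seasons (one of several options).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    seasonals = [y[i] - level for i in range(m)]

    for t, obs in enumerate(y):
        s = seasonals[t % m]
        prev_level = level
        level = alpha * (obs - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (obs - level) + (1 - gamma) * s

    n = len(y)
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


def tune(train, valid, season_length, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Pick (alpha, beta, gamma) with the lowest MAE on the validation window."""
    best_err, best_params = None, None
    for a, b, g in product(grid, repeat=3):
        forecast = holt_winters_additive(train, season_length, a, b, g, len(valid))
        err = sum(abs(v - f) for v, f in zip(valid, forecast)) / len(valid)
        if best_err is None or err < best_err:
            best_err, best_params = err, (a, b, g)
    return best_err, best_params
```

For weekly retail data a season length of 52 would be the natural choice, which means each series needs at least two years of history under this initialisation.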

Results

Evaluated forecasts using MAE, RMSE, and WAPE across 840 test points, and modelled a rebate-contract scenario using horizon-2 forecasts and rolling bias correction to study KPI and accuracy trade-offs.
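The three metrics and a simple rolling bias correction can be sketched as follows. The metric formulas are standard; the bias-correction scheme (add the mean of recent known errors, with a lag so a horizon-2 forecast only uses errors observed by its origin) is an assumed design, and the window and lag defaults are hypothetical.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def wape(actual, forecast):
    """Weighted absolute percentage error: total absolute error over total volume."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(abs(a) for a in actual)


def rolling_bias_correction(actual, forecast, window=4, lag=2):
    """Shift each forecast by the mean error of the last `window` known periods.

    With lag=2, the forecast for period t only uses errors up to period t - 2,
    which mimics what is observable when forecasting two steps ahead.
    """
    corrected = []
    for t, f in enumerate(forecast):
        known = range(max(0, t - lag - window + 1), t - lag + 1)
        errors = [actual[i] - forecast[i] for i in known]
        bias = sum(errors) / len(errors) if errors else 0.0
        corrected.append(f + bias)
    return corrected
```

WAPE is often preferred over MAPE for retail series because departments with very low sales do not dominate the score.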

Tools Used

Python, Pandas, NumPy, Matplotlib, Jupyter Notebook.

Experience

Research and internship experience where I worked on applied ML, time-series imputation, cybersecurity data, and AI-assisted analysis tools.

Research Assistant

University of Luxembourg | Oct 2025 – Jan 2026

Worked on TimeCIM / TSGuard for missing-value imputation in satellite sensor time series, including temporal modelling, structured evaluation, and an LLM-based chatbot using the Groq API for simulation monitoring.

ML Intern

Toptech Solutions Private Limited | Feb 2022 – May 2022

Worked on network intrusion detection, including cybersecurity data preprocessing, model development, and evaluation for attack classification.

Research Intern / Visiting Student

École de technologie supérieure | Jul 2021 – Oct 2021

Worked on reinforcement learning and deep learning methods for network security, resource allocation, faulty-data detection, and self-optimized scheduling.

Certifications

Certifications that support my formal and project-based work in machine learning and data science.

Machine Learning Specialization

Stanford University / Coursera

Covers supervised learning, unsupervised learning, model evaluation, neural networks, and practical machine learning workflows.

IBM Data Science Specialization

IBM / Coursera

Applied data science program covering Python, analysis workflows, data visualization, machine learning, and project-based data science practice.

Contact Me