Araca Insights

Production analytics platform for semiconductor wafer polishing research

Lead Engineer 2026 - Present
Python, PyQt6, Dash, scikit-learn, LangChain, Ollama

Semiconductor wafer polishing is a precision process where small variations in force, temperature, and friction determine whether a wafer meets spec or gets scrapped. Engineers were spending 100+ hours per study manually processing sensor data - copying numbers between spreadsheets, running calculations by hand, building charts one at a time, and formatting reports from scratch.

Araca needed a platform that could eliminate this manual work entirely: ingest raw sensor data, compute domain-specific metrics automatically, surface patterns through interactive visualization, predict outcomes with ML, and generate publication-ready reports in one click.

Platform-first thinking

Rather than building a collection of scripts, I designed Araca Insights as a platform - a layered system where the data engine, user interface, and analytics modules are decoupled and independently extensible. The first release targets desktop (where researchers work offline in lab environments), with a web-based version planned to enable remote collaboration and broader access.
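As a rough illustration of that decoupling, the sketch below shows one way analytics modules might plug into a data engine through a small interface. The names `AnalyticsModule`, `DataEngine`, and `MeanCOF` are hypothetical and not taken from the Araca Insights codebase; the only domain fact used is that the coefficient of friction (COF) is shear force divided by down force.

```python
from dataclasses import dataclass, field
from typing import Protocol


class AnalyticsModule(Protocol):
    """Hypothetical interface that every analytics module implements."""

    name: str

    def compute(self, samples: list[dict[str, float]]) -> dict[str, float]: ...


@dataclass
class DataEngine:
    """Hypothetical data layer: holds samples and knows nothing about the UI."""

    samples: list[dict[str, float]] = field(default_factory=list)
    _modules: dict[str, AnalyticsModule] = field(default_factory=dict)

    def register(self, module: AnalyticsModule) -> None:
        # New analytics plug in here without touching engine or UI code.
        self._modules[module.name] = module

    def run(self, name: str) -> dict[str, float]:
        return self._modules[name].compute(self.samples)


class MeanCOF:
    """Example module: mean coefficient of friction (shear force / down force)."""

    name = "mean_cof"

    def compute(self, samples):
        cofs = [s["shear_force"] / s["down_force"] for s in samples]
        return {"mean_cof": sum(cofs) / len(cofs)}


# Made-up samples, for illustration only.
engine = DataEngine(samples=[
    {"shear_force": 30.0, "down_force": 100.0},
    {"shear_force": 50.0, "down_force": 100.0},
])
engine.register(MeanCOF())
result = engine.run("mean_cof")
```

A desktop or web front end would call `engine.run(...)` in the same way, which is what keeps the layers independently replaceable in this sketch.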

Desktop-first, web-ready architecture

Lab environments often lack reliable internet, and sensitive research data needs to stay local. A desktop app meets that immediate need, and the underlying architecture is designed to support a web-based version for team collaboration and remote access later.

WHY → meet users where they are today, scale to where they'll be tomorrow

Portable project format

Projects are fully self-contained and portable. Moving a project between machines or sharing with a colleague requires no reconfiguration - everything resolves automatically.
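One common way to get this behavior is to store every path relative to the project folder and resolve it at load time. The sketch below assumes a hypothetical `project.json` manifest and a hypothetical `resolve_data_files` helper; the actual Araca Insights project format is not described here.

```python
import json
import tempfile
from pathlib import Path


def resolve_data_files(project_dir: Path) -> list[Path]:
    """Resolve manifest entries against wherever the project folder lives now."""
    manifest = json.loads((project_dir / "project.json").read_text(encoding="utf-8"))
    # Entries are stored relative to the project root, never as absolute paths,
    # so the same manifest still works after the folder is moved or copied.
    return [project_dir / rel for rel in manifest["data_files"]]


# Demo: build a throwaway project, then "move" it by renaming the folder.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "study_a"
    (original / "raw").mkdir(parents=True)
    (original / "raw" / "run_001.csv").write_text("time,cof\n0,0.35\n", encoding="utf-8")
    (original / "project.json").write_text(
        json.dumps({"data_files": ["raw/run_001.csv"]}), encoding="utf-8"
    )

    moved = original.rename(Path(tmp) / "study_a_moved")
    resolved = resolve_data_files(moved)
    all_found = all(p.is_file() for p in resolved)
```

Because nothing in the manifest mentions the old location, the files are found again after the rename without any reconfiguration step.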

WHY → zero friction for collaboration and data sharing

Non-blocking UI with background processing

All data processing, metric computation, and report generation run asynchronously. Users can keep working while a bulk import of 50+ experiment files or a multi-chart report runs in the background.
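The pattern can be illustrated with a worker pool that parses files off the calling thread and reports each result as it completes. This is a minimal sketch using Python's standard `concurrent.futures` as a stand-in; a PyQt6 application would more likely use Qt's own threading facilities, and `parse_experiment`, `start_import`, and the file names are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def parse_experiment(name):
    """Hypothetical stand-in for parsing one experiment file."""
    time.sleep(0.01)  # simulate slow file I/O
    return name, len(name)


def start_import(pool, names, on_done):
    """Queue every file and return at once; on_done fires as each file finishes."""
    futures = []
    for name in names:
        future = pool.submit(parse_experiment, name)
        # The callback normally runs on a worker thread; a real GUI would
        # forward the result to its UI thread (in Qt, usually through a signal).
        future.add_done_callback(lambda f: on_done(*f.result()))
        futures.append(future)
    return futures


results = {}
lock = threading.Lock()


def record(name, size):
    with lock:  # callbacks may arrive from several worker threads
        results[name] = size


files = [f"run_{i:03d}.csv" for i in range(50)]
pool = ThreadPoolExecutor(max_workers=4)
futures = start_import(pool, files, record)
# start_import has already returned: a GUI would be back in its event loop here.
pool.shutdown(wait=True)  # the demo waits only so the results can be inspected
```

The key design point is that the caller never waits on a file: it hands work to the pool and reacts to completion callbacks, which is what keeps the interface responsive during large imports.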

WHY → responsive experience even with large datasets

What the platform does

Figure: Animated feature scatter (COF vs shear force) with time-encoded color reveals process evolution during polishing.

Figure: Key correlations dashboard with domain-specific plots (Stribeck curves, COF vs PV, pad temperature, Arrhenius, Preston's, and WIWNU) for multi-experiment comparison.

Figure: Multi-file comparison with PCA clustering, Z-scores, silhouette analysis, statistical analysis, and category breakdowns.

Figure: ML prediction interface (Ridge Regression shown) with predicted vs actual chart, feature coefficients, model diagnostics, and feature importance analysis.

Interactive Analytics
Multiple analysis views: time-series exploration, multi-experiment comparison, domain-specific correlation graphs, and ML-driven prediction. Every view is interactive, with configurable parameters and real-time updates.
ML Prediction
Multiple ML models with cross-validation and per-prediction uncertainty estimates. Diagnostic visualizations help users understand model confidence, feature importance, and when to trust the predictions.
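To make the idea concrete, the sketch below runs k-fold cross-validation on a one-feature ridge regression and uses the spread of the fold models' predictions as a rough per-prediction uncertainty. It is written in plain Python so it runs anywhere; the platform's stack lists scikit-learn, whose estimators would normally replace the hand-written `fit_ridge`. The data, the function names, and the choice of fold-ensemble spread as the uncertainty estimate are illustrative assumptions, not the platform's actual method.

```python
import statistics


def fit_ridge(xs, ys, alpha=1.0):
    """Closed-form ridge fit for one feature: returns (slope, intercept)."""
    x_mean, y_mean = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    return slope, y_mean - slope * x_mean


def cross_validate(xs, ys, k=4, alpha=1.0):
    """Return the k fold models and each fold's held-out mean absolute error."""
    models, errors = [], []
    for fold in range(k):
        # Interleaved folds: sample i is held out when i % k == fold.
        train = [i for i in range(len(xs)) if i % k != fold]
        held_out = [i for i in range(len(xs)) if i % k == fold]
        slope, intercept = fit_ridge(
            [xs[i] for i in train], [ys[i] for i in train], alpha
        )
        models.append((slope, intercept))
        errors.append(
            statistics.fmean([abs(ys[i] - (slope * xs[i] + intercept)) for i in held_out])
        )
    return models, errors


def predict_with_uncertainty(models, x):
    """Mean prediction across the fold models, plus their standard deviation."""
    preds = [slope * x + intercept for slope, intercept in models]
    return statistics.fmean(preds), statistics.stdev(preds)


# Made-up data: a roughly linear trend with small deviations.
pressure = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5]
removal = [2.1, 2.9, 4.2, 5.0, 6.1, 6.8, 8.2, 9.1, 9.8, 11.2, 12.1, 12.9]

models, fold_errors = cross_validate(pressure, removal)
prediction, spread = predict_with_uncertainty(models, 3.75)
```

The held-out errors indicate how well the model generalizes overall, while the spread gives a per-input signal: for a linear model like this one, disagreement between fold models generally grows as the input moves away from the training range, which is one cue for when a prediction deserves a closer look.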
Automated Reporting
One-click generation of publication-ready reports with embedded vector charts, executive summaries, and detailed data traces. Exports to PowerPoint and Excel for easy sharing with stakeholders.
AI Assistant
An LLM-powered assistant that provides contextual guidance on analysis workflow, model selection, and data interpretation. It is designed to run locally, so sensitive research data never leaves the network.

Where we are

100+ Hours of Manual Work Eliminated
1-Click Report Generation
4 ML Models
AI Assistant

The biggest lesson from this project is the value of meeting users where they are. Researchers don't want to deploy web apps or learn new tools - they want something that opens, works, and gets out of their way. Starting with desktop was the right call for adoption, and the architecture is ready to expand to the web when the team needs collaboration features.

The AI assistant is the most exciting part of what's next. Domain experts generate insights that ML models alone can't, and the assistant bridges that gap by helping users ask better questions of their data and understand when model predictions should be trusted or investigated further.