Araca Insights - Garegin Mazmanyan

The Problem

Semiconductor wafer polishing is a precision process where small variations in force, temperature, and friction determine whether a wafer meets spec or gets scrapped. Engineers were spending 100+ hours per study manually processing sensor data: copying numbers between spreadsheets, running calculations by hand, building charts one at a time, and formatting reports from scratch.

Araca needed a platform that could eliminate this manual work entirely: ingest raw sensor data, compute domain-specific metrics automatically, surface patterns through interactive visualization, predict outcomes with ML, and generate publication-ready reports in one click.

Approach

Platform-first thinking

Rather than building a collection of scripts, I designed Araca Insights as a platform: a layered system where the data engine, user interface, and analytics modules are decoupled and independently extensible. The first release targets desktop (where researchers work offline in lab environments), with a web-based version planned to enable remote collaboration and broader access.

Desktop-first, web-ready architecture

Lab environments often lack reliable internet, and sensitive research data needs to stay local. Starting with a desktop app solved the immediate need, and the underlying architecture is designed to support a web-based version for team collaboration and remote access.

WHY → meet users where they are today, scale to where they'll be tomorrow

Portable project format

Projects are fully self-contained and portable. Moving a project between machines or sharing it with a colleague requires no reconfiguration; everything resolves automatically.

WHY → zero friction for collaboration and data sharing
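
One common way to get this kind of portability is to store every file reference relative to the project folder and resolve it at load time. The sketch below illustrates that idea only; the manifest name `project.json` and the functions `save_manifest` and `load_manifest` are hypothetical and are not taken from the Araca Insights codebase.

```python
import json
from pathlib import Path

MANIFEST_NAME = "project.json"  # hypothetical manifest file name


def save_manifest(project_dir: Path, data_files: list[Path]) -> None:
    """Record data files as POSIX-style paths relative to the project folder."""
    root = project_dir.resolve()
    entries = [f.resolve().relative_to(root).as_posix() for f in data_files]
    (root / MANIFEST_NAME).write_text(json.dumps({"files": entries}, indent=2))


def load_manifest(project_dir: Path) -> list[Path]:
    """Rebuild usable paths from wherever the project folder lives now."""
    manifest = json.loads((project_dir / MANIFEST_NAME).read_text())
    return [project_dir / rel for rel in manifest["files"]]
```

Because nothing absolute is written to disk, copying the folder to another machine and calling `load_manifest` on its new location would yield valid paths without any reconfiguration step.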

Non-blocking UI with background processing

All data processing, metric computation, and report generation run asynchronously. Users can continue working while bulk imports of 50+ experiment files run or multi-chart reports are generated in the background.

WHY → responsive experience even with large datasets
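
The background-processing pattern described above can be sketched with Python's standard `concurrent.futures` module. This is a minimal illustration, not the platform's actual implementation: `summarize_file`, `start_import`, and `completed_count` are hypothetical names, and the per-file work is a stand-in for real parsing and metric computation.

```python
from concurrent.futures import Future, ThreadPoolExecutor, wait


def summarize_file(name: str, samples: list[float]) -> tuple[str, float]:
    """Stand-in for per-file work: return the file name and its mean sample."""
    return name, sum(samples) / len(samples)


def start_import(pool: ThreadPoolExecutor, files: dict[str, list[float]]) -> list[Future]:
    """Queue every file and return immediately so the caller is not blocked."""
    return [pool.submit(summarize_file, name, data) for name, data in files.items()]


def completed_count(futures: list[Future]) -> int:
    """Cheap progress check that a UI timer could call to update a progress bar."""
    return sum(1 for f in futures if f.done())


if __name__ == "__main__":
    files = {f"run_{i:02d}.csv": [float(i), float(i + 1)] for i in range(50)}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = start_import(pool, files)
        # A real UI would keep handling events here and poll completed_count().
        wait(futures)
        results = dict(f.result() for f in futures)
    print(f"{completed_count(futures)} of {len(futures)} files processed")
```

The key design point is that `start_import` returns futures rather than results, so the interface thread never waits on the work itself.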

Gallery

Animated scatter plot showing COF vs shear force with time-encoded color gradient

Key correlations dashboard showing Stribeck curves, COF vs PV, pad temperature, and WIWNU plots

Compare files view with PCA clustering, Z-scores, and category breakdown

ML prediction interface showing Ridge Regression with predicted vs actual chart

Animated feature scatter with time-encoded color reveals process evolution during polishing

Implementation

What the platform does

Interactive Analytics

Multiple analysis views: time-series exploration, multi-experiment comparison, domain-specific correlation graphs, and ML-driven prediction. All views are interactive, with configurable parameters and real-time updates.
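
For readers unfamiliar with the quantities behind these correlation graphs, the sketch below computes three of them using their commonly cited definitions: coefficient of friction (COF) as shear force over down force, PV as pressure times sliding velocity, and within-wafer non-uniformity (WIWNU) as the sample standard deviation of removal rate across measurement sites divided by the mean, in percent. These definitions are assumptions here; the exact formulas, units, and function names used inside Araca Insights are not stated in this write-up.

```python
from statistics import mean, stdev


def cof(shear_force: float, down_force: float) -> float:
    """Coefficient of friction: shear force divided by down (normal) force."""
    return shear_force / down_force


def pv(pressure: float, velocity: float) -> float:
    """Pressure times sliding velocity, a common x-axis for COF and removal-rate plots."""
    return pressure * velocity


def wiwnu_percent(removal_rates: list[float]) -> float:
    """Within-wafer non-uniformity: 100 * sample std dev / mean of site removal rates."""
    return 100.0 * stdev(removal_rates) / mean(removal_rates)
```

Applied to each time step of a run, `cof` gives the series that a COF vs shear force or COF vs PV plot would draw on.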

ML Prediction

Four ML models, Ridge Regression among them, with cross-validation and per-prediction uncertainty estimates. Diagnostic visualizations help users understand model confidence, feature importance, and when to trust the predictions.
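
To make "cross-validation and per-prediction uncertainty" concrete, here is a dependency-free sketch for a single input feature. It fits ridge regression in closed form, scores it with k-fold cross-validation, and estimates uncertainty for one prediction from the spread of bootstrap refits. The bootstrap is an assumed technique chosen for illustration; the write-up does not say how the platform derives its uncertainty estimates, and all function names below are hypothetical.

```python
import random
from statistics import mean, stdev


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge fit for one feature (alpha > 0); intercept is unpenalized."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    return slope, y_bar - slope * x_bar


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


def cross_val_rmse(xs, ys, alpha, k=5):
    """k-fold cross-validation: every point is held out exactly once."""
    n = len(xs)
    sq_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train = [i for i in range(n) if i not in test_idx]
        model = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], alpha)
        sq_errors += [(predict(model, xs[i]) - ys[i]) ** 2 for i in test_idx]
    return (sum(sq_errors) / len(sq_errors)) ** 0.5


def predict_with_uncertainty(xs, ys, x_new, alpha, n_boot=200, seed=0):
    """Return (mean, std) of predictions at x_new across bootstrap refits."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit_ridge_1d([xs[i] for i in idx], [ys[i] for i in idx], alpha)
        preds.append(predict(model, x_new))
    return mean(preds), stdev(preds)
```

A wide bootstrap spread for a given input is one signal a diagnostic view could surface to tell the user that the prediction deserves a closer look.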

Automated Reporting

One-click generation of publication-ready reports with embedded vector charts, executive summaries, and detailed data traces. Exports to PowerPoint and Excel for easy sharing with stakeholders.

AI Assistant

An LLM-powered assistant that provides contextual guidance on analysis workflow, model selection, and data interpretation. Designed to run locally so sensitive research data never leaves the network.

Results

Where we are

100+ Hours of Manual Work Eliminated

1-Click Report Generation

4 ML Models

AI Assistant

Reflection

The biggest lesson from this project is the value of meeting users where they are. Researchers don't want to deploy web apps or learn new tools; they want something that opens, works, and gets out of their way. Starting with desktop was the right call for adoption, and the architecture is ready to expand to the web when the team needs collaboration features.

The AI assistant is the most exciting part of what's next. Domain experts generate insights that ML models alone can't, and the assistant bridges that gap by helping users ask better questions of their data and understand when model predictions should be trusted or investigated further.