
DietSkan
Automatic ingredient-level nutrient estimates from a single photo or short video
Manual food logging is time-consuming and systematically biased. DietSkan’s goal is to produce near real-time estimates of calories and macronutrients from a simple smartphone capture, so researchers and consumers can track dietary intake quickly and with less burden. By turning a single photo or short video into ingredient-level nutrient estimates, DietSkan aims to support diabetes management, everyday health goals, and rigorous dietary assessment without scales, barcode scans, or long food diaries.
01
Project Overview

DietSkan is a vision-first system that produces ingredient-level calorie and macronutrient estimates from a single photo or a short smartphone video. The project responds to the well-documented limitations of self-reported intake and the need for scalable, objective measurement in clinical and population research.
The pipeline follows a camera-first flow: capture an image or short multi-view clip, segment and classify food items, estimate portion size or mass, then map those measurements to standardized food composition tables. Models are trained on curated RGB datasets, including Nutrition5k, and are supplemented with in-lab collections and careful label curation so that mass and macro predictions are reliable and comparable across studies.
Outputs include per-item confidence or uncertainty scores that can trigger a quick re-capture or user prompt when needed. The system is designed for practical use on standard smartphones and for integration into clinical and research workflows, as well as consumer-facing applications.
02
Significance
Dietary self-report is persistently inaccurate, especially for mixed dishes, snacks, and eating outside the home. This inaccuracy limits both day-to-day self-management and the quality of data collected in clinical and population studies. At the same time, most existing image-based systems either assume tightly controlled capture conditions, require extra hardware, or do not expose uncertainty in a way that is useful for researchers and clinicians.
DietSkan addresses the practical problem of producing reliable, ingredient-level nutrient estimates directly from smartphone vision. The system must identify ingredients in complex, mixed dishes, estimate portion size or mass with usable accuracy, and provide uncertainty-aware nutrient estimates that remain robust to everyday lighting, occlusion, and casual capture.
By reducing user effort to a single photo by default, with an optional short video for higher accuracy, DietSkan aims to make objective dietary assessment accessible to people managing diabetes and other chronic conditions, and to researchers who need consistent, scalable intake measures across large studies.

03
Approach
Camera capture and data flow
- One quick photo by default, with an optional short multi-view clip that provides better cues for scale.
- A camera-first pipeline that converts RGB frames into segmented food items, estimated portion size or mass, and mapped nutrients using standardized composition tables.
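The capture-to-nutrients flow can be sketched in a few lines. Everything below is an illustrative stand-in, not DietSkan's implementation: the stage functions, the pixel-to-centimeter scale, and the areal-density prior are hypothetical placeholders for the real models.

```python
from dataclasses import dataclass

@dataclass
class FoodItem:
    label: str           # ingredient class from the segmenter
    mask_area_px: float  # segmentation mask area in pixels
    mass_g: float = 0.0  # filled in by the portion-size stage

def segment_items(frame) -> list[FoodItem]:
    """Stand-in for the detection/segmentation model."""
    # A real model would run on the captured RGB frame;
    # these fixed outputs are placeholders.
    return [FoodItem("rice", mask_area_px=12000.0),
            FoodItem("broccoli", mask_area_px=4500.0)]

def estimate_mass(item: FoodItem, px_per_cm: float) -> FoodItem:
    """Toy portion model: mask area -> grams via an areal-density prior."""
    area_cm2 = item.mask_area_px / px_per_cm ** 2
    item.mass_g = area_cm2 * 1.2  # hypothetical 1.2 g/cm^2 prior
    return item

def run_pipeline(frame, px_per_cm: float = 40.0) -> list[FoodItem]:
    """Capture -> segment/classify -> portion estimate."""
    return [estimate_mass(it, px_per_cm) for it in segment_items(frame)]
```

Nutrient mapping against composition tables would then run over the returned items.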


AI-driven vision-first estimation engine
- Strong detectors and segmenters, including transformer-based components where helpful, to find ingredients and small items in mixed dishes.
- Direct estimation of portion size (mass or volume) from images, with short handheld video used to resolve scale without any extra hardware.
- Per-item confidence or uncertainty scores that allow the system to request a quick re-capture or apply simple checks to avoid large errors when confidence is low.
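A confidence-gated check like the one in the last bullet might look as follows; the threshold values and the `check_items` helper are illustrative assumptions, not DietSkan's actual parameters.

```python
def check_items(items: list[dict],
                conf_threshold: float = 0.6,        # hypothetical cutoff
                max_plausible_mass_g: float = 1500.0  # hypothetical bound
                ) -> list[str]:
    """Return labels of items that should trigger a re-capture prompt:
    either the per-item confidence is low, or the mass estimate falls
    outside a plausible range."""
    flagged = []
    for item in items:
        low_conf = item["confidence"] < conf_threshold
        bad_mass = not (0.0 < item["mass_g"] <= max_plausible_mass_g)
        if low_conf or bad_mass:
            flagged.append(item["label"])
    return flagged
```

If the returned list is non-empty, the app can prompt for a short multi-view clip rather than silently emitting a low-quality estimate.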
Detection, temporal linking, and post-processing
- Detection and segmentation of each visible item, with a lightweight tracker that links items across frames for temporal stability in short videos.
- Ingredient-aware classes and an explicit focus on small items to reduce misses, and the use of basic text cues where available to correct ambiguous labels.
- Mapping of items to food composition tables, adjustment of labels when needed, and output of per-ingredient mass and nutrient estimates.
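The final mapping step amounts to scaling per-100 g composition values by the predicted mass. A minimal sketch, using a toy two-entry table where a real deployment would load a standardized database (e.g. USDA entries):

```python
from dataclasses import dataclass

# Toy per-100 g composition table; values are illustrative only.
COMPOSITION_PER_100G = {
    "rice":    {"kcal": 130.0, "protein_g": 2.7,  "carbs_g": 28.0, "fat_g": 0.3},
    "chicken": {"kcal": 165.0, "protein_g": 31.0, "carbs_g": 0.0,  "fat_g": 3.6},
}

@dataclass
class DetectedItem:
    label: str    # final (possibly corrected) ingredient label
    mass_g: float # predicted portion mass

def nutrients_for(item: DetectedItem) -> dict[str, float]:
    """Scale per-100 g composition values by the predicted mass."""
    scale = item.mass_g / 100.0
    return {k: round(v * scale, 1)
            for k, v in COMPOSITION_PER_100G[item.label].items()}

def estimate_meal(items: list[DetectedItem]) -> dict[str, float]:
    """Sum per-ingredient estimates into meal totals."""
    totals = {"kcal": 0.0, "protein_g": 0.0, "carbs_g": 0.0, "fat_g": 0.0}
    for item in items:
        for k, v in nutrients_for(item).items():
            totals[k] += v
    return totals
```

Keeping `nutrients_for` separate from `estimate_meal` is what makes the output ingredient-level rather than a single opaque total.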



Practical deployment and evaluation
- Minimal user steps: one photo by default, short video optional when the user or system wants higher accuracy. Core functionality runs on device, with an optional server step for heavier tasks.
- Validation on Nutrition5k and an in-lab dataset, with reporting of per-ingredient error and a reproducible evaluation protocol for volume and calorie estimation that can be reused in future work.
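A reproducible per-ingredient error report can be as simple as mean absolute mass error per ingredient class. A sketch under the assumption that each dish is represented as a label-to-grams mapping (the representation and the missed-ingredient convention are assumptions, not the project's published protocol):

```python
from collections import defaultdict

def per_ingredient_mae(predictions: list[dict[str, float]],
                       ground_truth: list[dict[str, float]]) -> dict[str, float]:
    """Mean absolute mass error (grams) per ingredient class.

    Each dish is a mapping from ingredient label to mass in grams;
    an ingredient the model missed counts as a prediction of 0 g.
    """
    errors = defaultdict(list)
    for pred, truth in zip(predictions, ground_truth):
        for label, true_mass in truth.items():
            errors[label].append(abs(pred.get(label, 0.0) - true_mass))
    return {label: sum(errs) / len(errs) for label, errs in errors.items()}
```

Reporting error per ingredient class, rather than per dish, exposes exactly which foods (e.g. sauces or small items) drive the overall calorie error.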
Publications
In progress
