# Roman Bellisari

> Design engineer and AI/ML engineer based in New York City. Builds production AI systems, interactive media, and data platforms at the intersection of engineering, design, and machine learning.

## About

Roman Bellisari is a software engineer and designer based in New York City who works at the intersection of AI/ML engineering, full-stack development, UI/UX design, and creative technology. He is interested in building systems with lasting impact.

Over the past several years, he has worked across engineering and design, focusing on software, data, and machine learning systems. His work has included collaborations with startups, research studios, creative agencies, and larger organizations including Addition ML, McCann, Prudential, Realtor.com, MakeMePulse, Google, Cerberus Capital, Blackstone, IBM, McKesson, and others.

He currently works as a designer and engineer with a focus on AI/ML engineering, UI/UX design, and full-stack web development. He is especially interested in human-centered design, circular fashion, remote sensing, geospatial data systems, and compelling visual experiences.

In his free time he enjoys reading, running, baking bread, making music, and working with electronics.

### Contact

- Email: romanbellisari [at] gmail [dot] com
- LinkedIn: https://www.linkedin.com/in/romanbellisari/
- GitHub: https://github.com/romanbell
- Are.na: https://www.are.na/roman-bellisari/index
- Website: https://romanbellisari.com

### Technical Skills and Expertise

- Languages: Python, TypeScript, JavaScript, SQL
- AI/ML: LLM orchestration, multi-modal AI pipelines, computer vision, NLP, MLOps, generative AI
- Design: UI/UX design, design systems, interactive media, data visualization, computational design, responsive design
- Frontend: React, Next.js, TypeScript, Tailwind CSS, Framer Motion, GSAP
- Backend: Python, FastAPI, Node.js, Firebase, serverless architectures, microservices, Docker, Google Cloud Run
- Data Engineering: Apache Airflow, Docker, Azure, Google Cloud Platform, PostgreSQL, SQL data modeling, ETL/ELT pipelines, data warehousing
- Infrastructure: Google Cloud, Azure, Render, Vercel, Firebase, CI/CD, observability, fault-tolerant systems
- Domains: Generative AI, video generation, quantitative finance, healthcare technology, real estate technology, geospatial systems, urban data

---

## Work Portfolio

### Realtor.com AI Tours
**Software Engineering, Machine Learning, Video Systems**

Developed an AI-driven video generation engine in collaboration with Addition ML for Realtor.com to automate property and community storytelling at scale. The system synthesizes voiceover, imagery, mapping, and copy through a multi-modal ML pipeline integrating GPT-4o, Claude 3.5 Sonnet, ElevenLabs, GPT-4 Vision, SerpAPI, and Mapbox. Built deterministic rendering flows with intelligent image curation, geospatial overlays, and concurrent batch video creation. Deployed Dockerized FastAPI microservices on Google Cloud Run with Firebase orchestration, full MLOps observability, and fault-tolerant queue management.

The system automates the creation of property and community tours by combining machine learning, geospatial data, voice synthesis, and custom video rendering. It was designed to allow Realtor.com to scale storytelling across large inventories of listings, producing videos that feel consistent, informative, and visually coherent. The platform integrates text generation, image selection, mapping overlays, and a structured narrative framework.

The system works through several integrated components. AI script generation scrapes the listing's details and images, then generates a script that highlights key features of the home. Automated voiceovers narrate the script using AI-generated voices. Dynamic music selection draws from a pre-vetted library. Image-driven storytelling selects and sequences listing images to match the AI-generated script. A flexible video length system uses dynamic components that allow each video to stretch or shrink based on the depth of information in the listing.
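The flexible video length idea can be illustrated with a minimal sketch. This is not the production logic: the `Segment` type, the `plan_durations` helper, and the narration pace of 2.5 words per second are all assumptions made for illustration. Each segment gets the time its narration needs, clamped to a per-segment minimum and maximum.

```python
from dataclasses import dataclass

# Assumed narration pace; a real system would measure the synthesized audio.
WORDS_PER_SECOND = 2.5


@dataclass
class Segment:
    """Hypothetical video segment with narration length and duration bounds."""
    name: str
    words: int          # words of script assigned to this segment
    min_seconds: float  # shortest the segment may run
    max_seconds: float  # longest the segment may run


def plan_durations(segments):
    """Give each segment its spoken duration, clamped to its bounds."""
    plan = {}
    for seg in segments:
        spoken = seg.words / WORDS_PER_SECOND
        plan[seg.name] = min(max(spoken, seg.min_seconds), seg.max_seconds)
    return plan


if __name__ == "__main__":
    tour = [
        Segment("intro", words=10, min_seconds=3, max_seconds=8),
        Segment("kitchen", words=50, min_seconds=3, max_seconds=12),
        Segment("yard", words=0, min_seconds=2, max_seconds=6),
    ]
    print(plan_durations(tour))
```

A sparse listing shrinks toward the minimums, while a detailed one grows until each segment hits its cap, so total runtime tracks the depth of the listing.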

Beyond individual home listings, dedicated templates for new construction communities highlight multiple floor plans, community amenities, and local insights, with the AI pulling in top-rated schools, restaurants, and green spaces from real-time local data.

Roman contributed backend engineering for deterministic video assembly, batch processing, model integration, and orchestration across cloud services. FastAPI microservices handled routing, concurrency, and state management, while a collection of multimodal models produced scripts, voiceovers, generated images, and location descriptions.

The system merges automation with editorial intent, making it possible to deliver story-driven tours across thousands of homes while preserving clarity, accuracy, and visual consistency.

---

### Prudential Flash Forward
**Software Engineering, Generative AI, Interactive Media**

Developed an interactive generative AI experience for Prudential's live events in collaboration with McCann via Addition ML, blending storytelling and personalization through GPT-4, Claude 3.5, ElevenLabs, and custom PyTorch/Hugging Face image models. Built real-time FastAPI services on Google Cloud Run with integrated photo booth capture, Firebase storage, and Twilio SMS delivery. Optimized few-shot prompt systems for sub-5-second narrative generation, supporting thousands of participants across national activations.

Flash Forward is an interactive generative media experience designed for live events. Participants enter a small booth, take a photo, answer a few questions, and receive a personalized narrative artifact within seconds. The system synthesizes text, voice, and imagery through a collection of custom pipelines built for low latency and high reliability.

Roman worked on the backend services, multimodal model coordination, asset generation, and real-time response flow. The system supported thousands of live participants with consistent performance, requiring aggressive caching, concurrency handling, and monitoring.

Featured in the Wall Street Journal.

---

### Patient Storytelling Platform for Organ Donations
**Software Engineering, Digital Storytelling, UX Design**

Built as a multi-modal narrative system in collaboration with a media agency and an AI advertising studio, the platform generates donor-recipient stories through text, image, and audio synthesis. It positions AI as a medium for care rather than efficiency.

The platform helps organ transplant patients communicate their stories with clarity, dignity, and emotional resonance. The tool guides a patient through sharing their name, location, the organ they need, the motivations that keep them fighting, and the photos or videos that give those motivations texture. After a short voice memo, the system assembles everything into a fully composed outreach video that patients can review, refine, and share.

Roman designed and implemented the backend architecture, the API surface, the multimodal model integrations, and the data workflows that transform chat inputs into structured story components. The production pipeline orchestrates brainstorming prompts, content moderation, script assembly, image generation, video generation, and multi-language translation. It combines models such as GPT-4o, Gemini 2.5 Pro, Imagen, and Veo.

---

### Biography Narration Platform
**Software Engineering, AI Systems, Narrative Design**

Built a full-stack AI platform prototype in collaboration with an AI advertising studio and a creative agency for a major communications company. The tool generates dynamic biographies blending text, image, and voice. Designed and deployed multi-modal pipelines integrating Gemini 2.0, Imagen 3.0, and Google TTS to synthesize personalized narrative flows. Engineered scalable FastAPI microservices with Firebase authentication, Google Cloud Storage, and email delivery integration, supported by observability, content moderation, and EPUB automation for digital publishing.

The platform turns interviews, memories, and shared media into a living biography. Users respond to guided prompts, upload photos, and invite friends or family to contribute stories. The system shapes these fragments into chapters that read like a book, blending long-form text, images, and voice into a cohesive narrative.

The biggest challenge was maintaining a coherent voice and structure across biographies that often exceeded ten thousand words. A hierarchical prompting system was designed: first generating an intro paragraph and a rough chapter outline, then expanding sections based on topics extracted from interviews, and finally crafting an outro while parallel processes fetched image assets and audio narration.
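The staged flow described above can be sketched with `asyncio`. This is an illustrative outline only: `fake_llm` and `fetch_asset` are hypothetical stand-ins for real model and media calls, and the prompt strings are invented. The sketch shows the ordering: intro and outline first, chapter expansion next, then the outro generated concurrently with image and audio fetching.

```python
import asyncio


async def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; it just echoes the prompt."""
    await asyncio.sleep(0)
    return f"[text for: {prompt}]"


async def fetch_asset(kind: str, topic: str) -> str:
    """Hypothetical stand-in for image generation or audio narration."""
    await asyncio.sleep(0)
    return f"{kind}:{topic}"


async def build_biography(name: str, topics: list[str]) -> dict:
    # Stage 1: intro paragraph and a rough chapter outline.
    intro = await fake_llm(f"intro for {name}")
    outline = [f"Chapter {i + 1}: {topic}" for i, topic in enumerate(topics)]

    # Stage 2: expand every outlined chapter, sharing the intro as context
    # so the voice stays consistent across sections.
    chapters = await asyncio.gather(
        *(fake_llm(f"{title} | context: {intro}") for title in outline)
    )

    # Stage 3: write the outro while images and narration are fetched in parallel.
    outro, images, audio = await asyncio.gather(
        fake_llm(f"outro for {name}"),
        asyncio.gather(*(fetch_asset("image", t) for t in topics)),
        asyncio.gather(*(fetch_asset("audio", t) for t in topics)),
    )
    return {
        "intro": intro,
        "chapters": list(chapters),
        "outro": outro,
        "images": images,
        "audio": audio,
    }


if __name__ == "__main__":
    book = asyncio.run(build_biography("Ada", ["childhood", "career"]))
    print(len(book["chapters"]), "chapters;", book["images"])
```

Passing the intro into every chapter prompt is one simple way to anchor tone across a long document; a real system might pass a running summary instead.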

---

### The Comma Project
**Punctuation Analysis, Data Visualization, Text Exploration**

An interactive web experience that uncovers the hidden patterns of punctuation in classic literature. By analyzing the 50 most popular public domain books from Project Gutenberg, the platform quantifies and visualizes how famous authors use punctuation, revealing the subtle patterns that define their writing style.

Built with Next.js, TypeScript, Tailwind CSS, Framer Motion, and GSAP. Features web animations and a modern responsive UI. A custom Python pipeline handled tokenization, regex extraction, and pattern mapping, generating structured tables of punctuation counts, sentence lengths, character density, and rhythm variance.
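A minimal sketch of what such a punctuation pass might look like is shown here. The definitions are assumptions, since the project page does not spell them out: "density" is taken to mean punctuation marks per non-whitespace character, and "rhythm variance" is taken to mean the population variance of sentence lengths in words. The `punctuation_profile` function name is hypothetical.

```python
import re
from collections import Counter
from statistics import pvariance

# Marks to count; the real pipeline may track a different set.
PUNCT = re.compile(r"""[.,;:!?"'()-]""")
# Naive sentence boundary: any run of terminal punctuation.
SENTENCE_END = re.compile(r"[.!?]+")


def punctuation_profile(text: str) -> dict:
    """Return punctuation counts and simple rhythm statistics for a text."""
    counts = Counter(PUNCT.findall(text))
    sentences = [s.split() for s in SENTENCE_END.split(text) if s.strip()]
    lengths = [len(words) for words in sentences]
    non_space = sum(1 for ch in text if not ch.isspace())
    return {
        "counts": dict(counts),
        "sentence_lengths": lengths,
        # Assumed definition: punctuation marks per non-whitespace character.
        "density": sum(counts.values()) / non_space if non_space else 0.0,
        # Assumed definition: variance of sentence length in words.
        "rhythm_variance": pvariance(lengths) if lengths else 0.0,
    }


if __name__ == "__main__":
    print(punctuation_profile("Call me Ishmael. Yes, really!"))
```

Splitting on terminal punctuation is crude (abbreviations and quoted dialogue will confuse it), so a production pipeline would likely use a proper sentence tokenizer.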

Live at: https://commaproject.dev

---

### Sona Mecha
**Urban Sound Mapping, Sensor Networks, Data Visualization**

Explores how the sonic landscape of a city reflects broader social and environmental inequalities. The project treats sound as spatial data rather than background noise, proposing a new way of reading the city through the textures of its air, the hum of its infrastructure, and the rhythm of its people.

The project deploys a network of compact recording nodes to capture a complete acoustic field. A companion interface visualizes this data as a responsive 3D sound map where peaks and dips in intensity pulse in real time.

---

### Cartas.io
**Computational Design, Machine Learning, Research**

A research-driven application exploring how computer vision and generative modeling can accelerate garment prototyping and pattern design. Leveraging pose estimation, 2D-3D reconstruction, and diffusion-based synthesis, the project reimagines how designers engage with material form in digital space.

Live at: https://cartas.io

---

### Travel Itinerary Platform
**Backend Engineering, AI Systems, Scalable Pipelines**

Built an AI-powered travel planning platform over multiple years that generated structured, multi-day itineraries from minimal user input. Led backend engineering and system architecture, designing deterministic planning logic alongside LLM-driven generation workflows. Developed low-latency FastAPI services with aggressive caching, a custom SQL data model deployed on Render, and scalable ingestion pipelines for large-scale global activity data. Integrated geospatial services, multilingual generation, and multi-currency support. Worked closely with a small interdisciplinary team alongside engineering leaders from Stripe, Google, and Capital One.

---

### Ratatouille.nyc
**AI Recipe Generation, Serverless Architecture**

A web app that treats cooking as improvisation rather than instruction. Users type in whatever they have on hand and receive simple ideas shaped around their ingredients. Built on Firebase Authentication, Firestore, and Python services with a lightweight frontend on Vercel.

Live at: https://ratatouille.nyc

---

### Cerberus Capital Management
**Machine Learning, Data Science, Alternative Asset Management**

Machine learning engineer focused on quantitative systems across private equity portfolios, building production ML workflows, attribution models, and large-scale data platforms to support commercial strategy, performance measurement, and operational insight.

Key work:
- Designed and deployed a multi-touch attribution model evaluating digital marketing performance across channels, supporting A/B testing and budget reallocation
- Developed ML pipelines to classify, normalize, and aggregate enterprise spend data using Python, Apache Airflow, Docker, and Azure
- Contributed to a large-scale data warehouse consolidation effort spanning hundreds of source tables
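
The specific attribution method used is not described here. As a generic illustration of multi-touch attribution, the sketch below uses the simplest variant, linear (equal-credit) attribution, where each channel in a converting journey receives an equal share of the conversion value. The function name and the sample journeys are hypothetical.

```python
from collections import defaultdict


def linear_attribution(journeys):
    """Split each conversion's value equally across the channels touched.

    journeys: iterable of (channels, conversion_value) pairs, where channels
    is the ordered list of marketing touchpoints before the conversion.
    """
    credit = defaultdict(float)
    for channels, value in journeys:
        if not channels:
            continue  # nothing to credit for a journey with no touchpoints
        share = value / len(channels)
        for channel in channels:
            credit[channel] += share
    return dict(credit)


if __name__ == "__main__":
    sample = [
        (["search", "email"], 100.0),  # two touches, 50 each
        (["email"], 50.0),             # single touch gets full credit
    ]
    print(linear_attribution(sample))
```

Position-based or data-driven models replace the equal `share` with weights per touchpoint, but the accumulate-per-channel structure stays the same.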

---

### Blackstone Portfolio Company
**Software Engineering, Quantitative Modeling, Credit Markets**

Software engineer working directly on quantitative systems supporting a globally distributed lending portfolio across multiple borrowing facilities and currencies. Built and maintained models and pipelines used to monitor global liquidity, concentration exposure, lender covenants, and credit risk in real time.

Automated credit risk and liquidity reporting using Python and SQL, replacing manual workflows with deterministic models tied directly to market data. Built serverless APIs and scalable data pipelines feeding real-time metrics into reporting and forecasting systems.

---

### McKesson Sales Permissions Architecture
**Data Engineering, Systems Design, Cross-Functional Collaboration**

Designed and implemented a complex SQL-driven permissions hierarchy for a national sales organization, integrating multiple data sources across legacy and modern environments. Built a multi-environment SQL pipeline with formalized build procedures, version control integration, scheduled deployments, and strict separation of environments.
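
One common way to express a hierarchy like this in SQL is a recursive common table expression. The sketch below is a toy illustration using SQLite through Python's standard library, not the actual schema: the `org` and `accounts` tables, their columns, and the sample rows are all invented. A user sees accounts owned by themselves or by anyone beneath them in the reporting tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: who reports to whom, and who owns each account.
    CREATE TABLE org (employee TEXT PRIMARY KEY, manager TEXT);
    CREATE TABLE accounts (account TEXT, owner TEXT);
    INSERT INTO org VALUES
        ('vp', NULL), ('east_mgr', 'vp'), ('rep_a', 'east_mgr'), ('rep_b', 'vp');
    INSERT INTO accounts VALUES
        ('acme', 'rep_a'), ('globex', 'rep_b'), ('initech', 'east_mgr');
    """
)

# Walk down the reporting tree from the given user, then filter accounts.
VISIBLE_SQL = """
WITH RECURSIVE reports(employee) AS (
    SELECT ?
    UNION
    SELECT org.employee
    FROM org JOIN reports ON org.manager = reports.employee
)
SELECT account FROM accounts
WHERE owner IN (SELECT employee FROM reports)
ORDER BY account
"""


def visible_accounts(user: str) -> list[str]:
    """Accounts owned by the user or by anyone in their reporting chain."""
    return [row[0] for row in conn.execute(VISIBLE_SQL, (user,))]


if __name__ == "__main__":
    for user in ("vp", "east_mgr", "rep_b"):
        print(user, visible_accounts(user))
```

Using `UNION` rather than `UNION ALL` in the recursive step deduplicates rows, which also guards against infinite recursion if the source data ever contains a reporting cycle.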

---

### Early-Stage MedSpa Data Platform
**Data Engineering, ETL Systems, CRM Architecture**

Designed and implemented the core data ingestion and transformation layer for an early-stage platform serving medical spas. Built a configurable ETL system capable of aggregating operational, appointment, and revenue data from multiple inconsistent sources.
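
A configurable ETL layer of this kind is often driven by per-source field mappings. The sketch below is a simplified illustration under that assumption: the source names, field names, and `normalize` helper are hypothetical, and only one cleaning rule (currency parsing) is shown.

```python
# Hypothetical per-source configuration: source field -> common schema field.
SOURCE_CONFIGS = {
    "booking_tool": {"client": "customer", "amt": "revenue", "when": "date"},
    "pos_export": {"Customer Name": "customer", "Total": "revenue", "Date": "date"},
}


def parse_money(raw) -> float:
    """Convert values like '$1,250.00' or 80 to a float; blanks become 0.0."""
    if raw in (None, ""):
        return 0.0
    return float(str(raw).replace("$", "").replace(",", ""))


def normalize(source: str, record: dict) -> dict:
    """Map one raw record from a named source onto the common schema."""
    mapping = SOURCE_CONFIGS[source]
    row = {target: record.get(field) for field, target in mapping.items()}
    row["revenue"] = parse_money(row["revenue"])
    row["source"] = source  # keep lineage so rows can be traced back
    return row


if __name__ == "__main__":
    print(normalize("pos_export",
                    {"Customer Name": "J. Doe", "Total": "$1,250.00", "Date": "2024-01-05"}))
    print(normalize("booking_tool", {"client": "A. Lee", "amt": 80, "when": "2024-01-06"}))
```

Adding a new source then means adding a mapping entry rather than writing new transformation code, which is the main appeal of the config-driven approach.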

---

### Additional Projects

- **Undisclosed Media**: Hypertext project and first full-stack MERN application — a collection of personal media connected to other people's work (https://github.com/romanbell/undisclosed)
- **Audio Classifier**: Custom ML models for audio genre classification using MFCC transformations and MLPs (https://github.com/romanbell/Audio-Analysis)
- **Amazon Reviews Prediction**: NLP sentiment analysis on 1M+ Amazon reviews (https://github.com/romanbell/Amazon-Reviews-NLP)
- **Nnector**: Full-stack web application for real estate project contractor bidding, built with a team of 5 engineers over a year
- **Calendar Design for Pat**: Print layout and visual design for a community fundraiser honoring a NYC bike mechanic

---

## Creative Work

### Photography
A curated photography portfolio available at https://romanbellisari.com/photography

### Artwork
Drawings and sketches in pigment ink, charcoal, and ballpoint pen. Available at https://romanbellisari.com/artwork

### Objects
Physical builds including vintage electronics restoration, motorcycle builds, and fabrication work. Available at https://romanbellisari.com/objects

### Bread
A sourdough baking project documented since 2023. Available at https://romanbellisari.com/bread