Research Scientist - Vision Data Infrastructure Job at Storm3, San Francisco, CA

  • Storm3
  • San Francisco, CA

Job Description

Research Scientists/Engineers (all levels)

🔍 Focus on Vision Data Infrastructure

🤖 Fundamental AI Research Institute

🌎 San Francisco Bay Area, USA

💸 $250,000 - $600,000 salary + annual bonus

Come join one of the only research institutions globally with the resources to compete with top AI companies: tens of thousands of GPUs to explore state-of-the-art research in LLMs, multimodal, and agentic AI.

Currently seeking AI talent with expertise in building scalable pipelines for vision data to support both image/video generative training and multimodal alignment. You’ll design high-performance pipelines for large-scale image and video datasets, enabling efficient pretraining, alignment, and simulation-based data generation.

Responsibilities:

Vision Data Sourcing & Curation

  • Collect and organize image and video data from open datasets and the web.
  • Handle data cleaning, filtering, deduplication, and metadata generation.
  • Ensure ethical and compliant data collection at scale.

Processing & Augmentation

  • Build high-throughput pipelines for vision data preprocessing (frame extraction, resolution normalization, format conversion, latent caching).
  • Implement GPU-accelerated augmentation and distributed data loading (WebDataset, TFRecords, Parquet).

Synthetic & Simulation-Based Data Generation

  • Use simulation tools (e.g., Unreal Engine 5, Isaac Sim, Unity) to generate high-quality synthetic vision data.
  • Create specialized datasets for VLM training, visual reasoning, and agent interaction.

Requirements:

  • Strong experience with data engineering, computer vision, or machine learning infrastructure.
  • Expertise in building and scaling ETL/data pipelines for large unstructured datasets.
  • Proficiency with Python, PyTorch, and distributed data frameworks (e.g., Ray, Spark, Dask).
  • Experience with WebDataset, TFRecords, Parquet, or similar high-throughput data formats.
  • Familiarity with GPU-accelerated preprocessing, NVIDIA DALI, or equivalent systems.
  • Understanding of image/video codecs, data compression, and cloud storage optimization.

Preferred Experience:

  • Prior work with simulation-based or synthetic data generation using Unreal Engine, Isaac Sim, or Unity.
  • Experience curating datasets for multimodal or vision-language model training.
  • Knowledge of data ethics, privacy, and compliance frameworks for large-scale AI datasets.
  • Experience contributing to open datasets or data-centric AI research.

Why apply:

  • Opportunity to join a fast-growing core team that is already pushing AI breakthroughs
  • Highly competitive salary package
  • Work alongside ambitious and bright superstars from tech and academia
  • Medical, Dental and Vision Insurance
  • Relocation package available

📧 Interested in applying? Please click on the ‘Easy Apply’ button or alternatively email me your resume at stefani.lukic@storm3.com
