StorageReview.com

Podcast #145: Why Edge AI Matters with Oregon State University

If you follow StorageReview regularly, you may have seen our piece on OSU's real-time ocean study. We spoke with Chris Sullivan, Director of Research and Academic Computing at Oregon State University, to better understand how technology helps researchers study ocean life and how that life impacts the global environment.

Brian met up with Chris at SC25 to discuss the topic further. It’s rare for us to write about how technology is used in real-world situations such as OSU’s plankton research, so this feature and the podcast gave us that opportunity.

As Brian and Chris discuss the research being conducted at OSU, there’s genuine excitement about how technology is advancing to support ongoing oceanic studies. Chris has been with Oregon State University for over 24 years and has served as director of biocomputing for the past three years.

This live podcast showcases Chris’s excitement for science and technology. He called SC25 “candy land.”

This is a lively and honest discussion (with some humor) that provides genuine insights into how research uses technology to deliver real-time results.

It’s a short podcast worth listening to, but if you’re pressed for time, we’ve provided timestamped highlights below so you can jump to the most relevant sections.

0:00 – 5:00

Turning a Research Vessel into a Floating Supercomputer

Chris Sullivan from Oregon State University explains how new tech is fundamentally changing how ocean science is done, especially massive plankton studies run from ships that are basically small, floating data centers.

  • Technology isn’t just “cool gear”; it directly changes what science is possible and the extent of bias in results.
  • Scientific bias: limited or skewed samples leading to inaccurate conclusions; the cure is more data points across more locations.
  • Goal: Collect huge, diverse datasets so statistical analyses aren’t distorted by sampling just one spot or one set of conditions.
  • Traditional HPC setups forced a trade-off: fast local NVMe/SSDs with limited capacity vs. large but slower tier-one storage systems.
  • For large experiments, Chris had to depend on tier-one storage across the cluster because on-node (tier-zero) storage was too small to hold massive datasets.
  • Big, shared-storage “super pods” became necessary to keep GPUs fed, but they’re expensive and not realistic to deploy in the field or at sea.

5:00 – 10:00

Ocean Robots, Plankton, and a Million-Dollar Cruise

This section dives into what actually happens on the ship—how they tow instruments, capture staggering amounts of image data, and why semi–real-time processing is critical when ship time costs about $1M per 10‑day cruise.

  • The new NSF-backed research vessel is built like a “mini Star Trek Enterprise,” packed with tech and about 200 miles of cabling on a 200‑foot boat.
  • Previously, Chris had to roll his own compute hardware onto the ship, string fiber down hallways, and operate without a proper on-board data center.
  • The plankton system: a towed device moves up and down through the water at 5 knots, pulling ~162 liters of water per second and generating ~10 GB of image data per minute.
  • A single 10‑day plankton experiment: ~100 TB of raw data; after processing, segmentation, and classification, that expands to ~300–400 TB of usable data (a rough sizing sketch follows this list).
  • They must process this in semi–real time; otherwise, a $1M, 10‑day cruise could return with bad or useless data and no opportunity to redirect the ship.
  • Plankton are mostly at the mercy of currents (not strong swimmers), yet they’re crucial: they produce ~50% of Earth’s oxygen, drive ~25% of the carbon sink, supply ~17% of per-capita food worldwide, and form the base of the marine food web.
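
To put those numbers in perspective, here is a quick back-of-the-envelope sketch in Python. The ~10 GB/minute rate and the 3–4x processing expansion come straight from the conversation; the hours of towing per day are our own assumption, purely for illustration, since the actual duty cycle depends on the experiment.

```python
# Back-of-the-envelope sizing for a 10-day plankton cruise.
# Figures from the podcast: ~10 GB of imagery per minute of towing,
# ~3-4x data expansion after segmentation/classification.
# The towing hours per day are an assumed duty cycle for illustration only.

GB_PER_MINUTE = 10       # raw image data rate while towing (from the podcast)
TOW_HOURS_PER_DAY = 17   # assumed duty cycle (hypothetical)
CRUISE_DAYS = 10
EXPANSION = 3.5          # processed/raw ratio (midpoint of the quoted 3-4x)

raw_tb = GB_PER_MINUTE * 60 * TOW_HOURS_PER_DAY * CRUISE_DAYS / 1000
processed_tb = raw_tb * EXPANSION

print(f"Raw imagery:      ~{raw_tb:.0f} TB")
print(f"After processing: ~{processed_tb:.0f} TB")
# Roughly 100 TB raw and ~350 TB processed for one cruise.
```

However you tune the assumptions, a single cruise lands in the hundreds of terabytes, which is exactly why Chris wants a self-contained, high-capacity tier zero on the ship rather than a shore-side array.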

10:00 – 15:00

Edge AI at Sea and in the Forest

The focus shifts from the ship to the broader “edge” problem: how to push compute closer to where data is created—on boats, in ranger stations, and across forests filled with autonomous sensors.

  • New high‑capacity SSD-based systems let them build a true tier zero on a single device: store hundreds of TB of data, process in semi‑real time, and never move it again.
  • Earlier tests with smaller NVMe drives required constant offloading to make room, which burned precious IO bandwidth and slowed the science.
  • There is essentially no dedicated IT staff on the ship; one person runs the ship systems and stays away from experiments, so grad students and postdocs must manage hardware and pipelines themselves.
  • This drives a design requirement: simple, robust pipelines and self-contained devices—no fragile web of arrays, mounts, and volumes.
  • At the forest edge (e.g., the Owl project), they run ~5,000 autonomous recording units and can identify ~130 bird species by sound alone.
  • The pain point: humans still have to go collect SD cards; by the time the data returns, much of the temporal value is gone. Edge processing aims to extract insights immediately at the point of capture (a sketch of that pattern follows below).
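
To make “process at the point of capture” concrete, here is a minimal, hypothetical sketch of what an edge recording unit could do instead of just filling an SD card. This is not OSU’s actual owl pipeline; record_clip(), classify_clip(), and send_detection() are placeholders for real audio capture, a bird-call model, and whatever uplink the unit has. The point is the shape of the loop: raw audio stays on the device, and only tiny detection records leave it.

```python
# Hypothetical edge loop for an autonomous recording unit.
# record_clip(), classify_clip(), and send_detection() are placeholders;
# this illustrates the pattern, not OSU's actual pipeline.
import json
import time
from datetime import datetime, timezone

CONFIDENCE_THRESHOLD = 0.8

def run_edge_unit(record_clip, classify_clip, send_detection):
    """Classify audio on-device and transmit only compact detections."""
    while True:
        clip = record_clip(seconds=30)             # raw audio stays local
        species, confidence = classify_clip(clip)  # on-device inference
        if confidence >= CONFIDENCE_THRESHOLD:
            detection = {
                "time": datetime.now(timezone.utc).isoformat(),
                "species": species,
                "confidence": round(confidence, 2),
            }
            # A few hundred bytes over a constrained link instead of
            # gigabytes of audio on an SD card someone has to collect.
            send_detection(json.dumps(detection))
        time.sleep(1)
```

That same shape is what Chris is after on the ship: keep the heavy data next to the sensor and move only what a decision actually needs.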

15:00 – 20:00

Matching the Right GPUs and Storage to the Job

Here, they unpack GPU strategy, Omniverse-based training, and why large-capacity flash is as important as raw compute for AI and scientific visualization.

  • Chris stresses using the proper GPU for the job: large, expensive accelerators (e.g., Hopper-class H100s and Grace Hopper superchips) should focus on training, while more cost-effective cards like RTX 6000s handle inference at scale.
  • Historically, there was confusion: too many overlapping GPU SKUs, unclear “swim lanes,” and a lot of money wasted on overkill hardware for simple inference.
  • RTX 6000s give them a dual role: high-performance inference plus graphics and rendering capabilities that data-center-only GPUs lack.
  • They’re building an Omniverse digital twin of the ship to:
    • Train students and postdocs virtually while the ship is at sea.
    • Plan physical layouts and equipment loading in tight, moving spaces.
    • Avoid wasting expensive ship time on basic training and trial‑and‑error setups.
  • On a moving ship, anything with moving parts is a risk; solid-state storage is preferred for reliability and resilience.
  • High-capacity SSDs in a small power envelope free up power budget for more GPUs, reduce the need for separate arrays, and shrink the total hardware footprint.

20:00 – 25:00

From Filing Cabinets to Fiber-Connected Oceans

Zooming out, how cheap storage unlocked modern AI, how ocean observatories stream data ashore, and why network and storage together are enabling new kinds of science.

  • Chris argues AI’s real unlock was not new math—most models date back to the 1950s–70s—but the cheap hard drive of the 1990s, which moved data from filing cabinets onto searchable, computable media.
  • In the “filing cabinet era,” they had tons of data but were “information poor”—no search, no redundancy, and no scalable processing.
  • There’s always been an arms race between processing speed and data availability; large-capacity SSDs now tip the balance again by feeding modern GPUs with far more data.
  • Oregon State is the cyberinfrastructure lead for the Ocean Observatories Initiative (OOI), a billion‑dollar NSF/WHOI project that uses gliders, buoys, and seafloor mounts to stream ocean data.
  • They hold ~9 PB of OOI data, with a fiber backbone upgraded from 200 Gbps to 400 Gbps via Link Oregon, enabling real-time data transfer from the ocean to the data center (a rough transfer-time calculation follows this list).
  • Even with Starlink on the ship and on land-based projects (like the owl work), satellite bandwidth and cost limit what can be sent, so they’re testing LoRaWAN and other mesh networks to funnel data back to a single, Starlink‑connected ranger station.
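
For a rough sense of what that backbone upgrade means in practice, here is a quick transfer-time calculation. It optimistically assumes the link can be driven near line rate, which real transfers rarely manage, so treat the numbers as a lower bound.

```python
# Rough transfer-time math for the Link Oregon backbone upgrade.
# Assumes the link runs near line rate; real transfers will be slower.

def transfer_hours(terabytes: float, gbps: float) -> float:
    """Hours to move `terabytes` of data over a `gbps` link."""
    bits = terabytes * 1e12 * 8
    return bits / (gbps * 1e9) / 3600

for link in (200, 400):
    print(f"{link} Gbps: 100 TB in ~{transfer_hours(100, link):.1f} h, "
          f"9 PB in ~{transfer_hours(9000, link) / 24:.1f} days")
# At 400 Gbps, one cruise's ~100 TB moves in roughly half an hour, and even
# the full ~9 PB OOI archive is a matter of days rather than months.
```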

25:00 – 27:52

Lessons for Enterprise: Your Data Center Is the Edge Now

The conversation closes by translating these research lessons to enterprises and highlighting the hardware Chris would “steal” from the show floor.

  • Core message to enterprises: if you’re not processing at the edge, you’re burning time and labor hauling raw data back and forth and missing real-time opportunities.
  • Edge AI and agents should preprocess, filter, and analyze locally to redirect human effort from logistics to higher‑value tasks and decision-making.
  • Real‑world analogs:
    • Retail uses edge analytics in stores to track traffic, make recommendations, and deliver dynamic experiences, much like forestry uses edge inference to protect endangered species while enabling logging.
    • Fisheries want “smart traps” and buoys so they only send out boats when there are actually crabs in the pots.
  • For Chris, the ideal pattern is on-premises data centers that become edge facilities integrated with the cloud for large training runs and longer‑term workloads.
  • Favorite hardware from the show: Dell’s PowerEdge XE7745; he wants more of them for VR labs and field deployments because they strike the right balance of power, footprint, flexibility, and capability (rendering + inference + edge work).
  • He sees systems like the XE7745 as the template: flexible enough to live in a data center or in harsh field conditions, supporting both traditional HPC and advanced AI/visualization at the edge.

Harold Fritts

I have been in the tech industry since IBM created the Selectric. My background, though, is writing, so I decided to get out of the pre-sales biz and return to my roots, doing a bit of writing while staying involved in technology.