New Startup Pumpkinseed Aims to Read the Proteome Letter by Letter, on a Chip
Pumpkinseed Technologies emerged from stealth this week with a $20 million Series A led by Future Ventures and NfX, capital it intends to spend on scaling a chip-based protein sequencing platform that, if it delivers, would put a real dent in mass spectrometry’s long monopoly on proteomics. At the least, it could do for proteomics what nanopore did for sequencing: open new possibilities.
The Palo Alto-based company, founded in 2021 by Stanford materials science professor Jennifer Dionne with co-founders Jack Hu (CSO) and Nhat Vu, is building hardware around a deceptively simple premise: every chemical bond scatters light at a characteristic frequency, so if you can read those vibrations cleanly enough, at single-molecule resolution, you can identify any protein — including ones that aren’t in any database.
That platform is called deSIPHR — de novo Sequencing and Identification of Proteins with High-throughput Raman spectroscopy.
Decoding the Proteome Black Box
DNA tells you what a cell could do. Proteins tell you what it is doing. And yet, while whole-genome sequencing is now a commodity priced under $100 and single-cell transcriptomics is producing tissue-scale atlases, proteomics has lagged behind by an order of magnitude or more in scale and resolution.
The reason is partly chemical, partly historical.
DNA has a four-letter alphabet. Proteins, once you account for post-translational modifications, non-canonical amino acids, and glycan decorations, have something closer to a thousand distinct chemical monomers. Mass spectrometry — the field’s workhorse for two decades — handles this complexity by fragmenting peptides and matching the resulting mass profiles against reference databases. It works well, but only for what the database already knows. Anything novel, anything that doesn’t ionize cleanly, anything carrying an unusual modification, tends to fall through the cracks.
Edman degradation, the other classical approach, is even more constrained: slow, low-throughput, and blocked by N-terminal modifications.
The result is a curious distortion in modern biology. Researchers have generated extraordinarily detailed maps of a small, well-characterized fraction of the proteome — and almost nothing for the rest.
deSIPHR’s Non-Destructive Approach
Pumpkinseed’s answer is to abandon database matching altogether and read amino acids directly from their physical signatures.
Raman spectroscopy has been around since the 1920s, but it has a brutal sensitivity problem: only about one photon in ten million scatters off a target molecule in a way that produces a useful spectral signal. That’s nowhere near enough for single-molecule work.
To compensate, deSIPHR uses a silicon photonic chip patterned with what the company describes as roughly 100 million nanoantenna sensors per square centimeter, or about a billion sensors per wafer. Each sensor concentrates incoming light into a sub-protein-scale volume, boosting Raman scattering efficiency by what Dionne’s group has previously published as roughly eight orders of magnitude — enough, in principle, to read individual amino acids from individual molecules.
Sequencing itself is done by subtraction. The chip captures a spectral fingerprint of a peptide, an enzyme cleaves the terminal residue, the chip captures the new fingerprint, and the difference reveals the residue that was removed. Repeat. According to its founders, the platform can currently handle peptides up to about 30 amino acids, with room to extend that.
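The subtract-and-repeat loop can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not Pumpkinseed's method: the fingerprints are random made-up vectors, a peptide's spectrum is modeled as the plain sum of its residues' fingerprints with no noise, the names (`make_reference`, `measure`, `closest_residue`, `sequence_by_subtraction`) are hypothetical, and removing the first character stands in for enzymatic cleavage of whichever terminus the real chemistry targets.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues only
N_BINS = 32  # hypothetical number of spectral bins


def make_reference(seed=0):
    """Toy reference library: one made-up fingerprint per amino acid."""
    rng = random.Random(seed)
    return {aa: [rng.random() for _ in range(N_BINS)] for aa in AMINO_ACIDS}


def measure(peptide, reference):
    """Toy measurement: assume the spectrum is the sum of residue fingerprints."""
    spectrum = [0.0] * N_BINS
    for aa in peptide:
        for i, value in enumerate(reference[aa]):
            spectrum[i] += value
    return spectrum


def closest_residue(diff, reference):
    """Return the amino acid whose fingerprint is nearest to the difference spectrum."""
    def squared_distance(aa):
        return sum((d - r) ** 2 for d, r in zip(diff, reference[aa]))
    return min(reference, key=squared_distance)


def sequence_by_subtraction(peptide, reference):
    """Call residues one at a time from before/after spectral differences."""
    called = []
    remaining = peptide
    while remaining:
        before = measure(remaining, reference)
        remaining = remaining[1:]  # stand-in for cleaving the terminal residue
        after = measure(remaining, reference)
        diff = [b - a for b, a in zip(before, after)]
        called.append(closest_residue(diff, reference))
    return "".join(called)


if __name__ == "__main__":
    reference = make_reference()
    print(sequence_by_subtraction("ACDEFGHIK", reference))
```

In this idealized model the difference spectrum is the removed residue's fingerprint, so the lookup is trivial. Real spectra are noisy and need not add up linearly, which is where the difficulty of the approach lies.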
The chips themselves are being manufactured on 300 mm wafers through a partnership with IMEC in Belgium — a serious choice that says the company is not interested in artisanal lab-scale fabrication. Pumpkinseed is also already collaborating with Genentech on identifying immunopeptides for oncology applications.
What It Actually Unlocks
If deSIPHR works at the throughput Pumpkinseed is promising, three things become tractable that are genuinely difficult today:
- De novo identification of unknown proteoforms. Post-translational modifications and non-canonical amino acids that mass spec misses become directly readable, because the system isn’t looking up the molecule in a list — it’s reading the bonds.
- Single-cell proteomics at scale. With a billion sensors per wafer and label-free detection, the floor for sample input drops, opening the door to single-cell molecular signatures rather than population averages.
- Training data for a virtual cell. This is the longer-game pitch. AlphaFold largely solved structure prediction for proteins whose sequence is already known. The bottleneck for the next generation of biological foundation models is the input layer: high-resolution, modification-aware proteomic data straight from real biological samples. Pumpkinseed is positioning itself as the instrument that supplies it.
In parallel, the company is developing a complementary assay called cell-MAPP (cell monitoring across phenotype progression) that pairs sequenced peptides with functional readouts of T-cell activation, a clear bid for the immunotherapy and TCR discovery markets.
Where the Skepticism Lives
Yields and reproducibility at scale. A billion sensors per wafer is an impressive number on a spec sheet. It is also a manufacturing nightmare. Sensor-to-sensor uniformity, defect tolerance, and run-to-run reproducibility are the kinds of mundane problems that have killed plenty of photonic biosensor startups before they reached real customers. The IMEC partnership helps, but until external labs publish data, this is a believe-it-when-you-see-it claim.
Sequencing-by-subtraction has its own biases. Enzymatic cleavage is sequence-dependent. Some terminal residues cleave cleanly, others don’t. The platform may end up with its own version of the “invisible to the database” problem — molecules that simply resist stepwise degradation.
Crowded competitive landscape. Pumpkinseed is not alone. Nautilus Biotechnology, Quantum-Si, Glyphic Biotechnologies, Encodia, and SemiconBio (formerly Roswell) are all chasing some flavor of next-generation proteomics with different physical principles — fluorescence, single-molecule binding, semiconductor-based electronic detection. Mass spectrometry isn’t standing still either; Orbitrap and timsTOF generations keep extending coverage. A $20M Series A is a solid platform-stage round, but it’s not Nautilus-scale capital, and the company will need to demonstrate clear performance advantages on real samples to win bench-level adoption.
Peptides, not full proteins. Reading peptides up to ~30 amino acids is a real capability, but it’s not whole-protein sequencing. The platform still depends on an upstream digestion step, with all the coverage and bias issues that implies.
The Bigger Frame
The pitch from Dionne and her co-founders is that proteomics is in roughly the same place genomics was before the Human Genome Project — a field with the right scientific questions but the wrong instruments. Whether deSIPHR turns out to be the analogous infrastructure shift, or just one more interesting Raman platform, will depend on what the chips actually do once they’re in customers’ hands.
For now, the technology is unusually concrete for an early-stage proteomics startup: a fab partner, a pharma collaborator, peer-reviewed underlying physics, and a clear architectural story for why the hardware should outscale incumbents. That’s a more credible starting position than most.
Links
- Company: pumpkinseed.bio
- Funding announcement: Series A press release
