Vector Space Biosciences & Oracle Develop Language Models to Advance Human Spaceflight

LA JOLLA, Calif.--Vector Space Biosciences today announced it is developing large language models (LLMs) and visualization applications in collaboration with the Oracle Cloud Infrastructure (OCI) team to aid in the advancement of space biosciences. These Artificial Intelligence (AI) models will support the research and development of countermeasures against diseases associated with spaceflight stressors, with the aim of protecting and repairing the human body during spaceflight. By leveraging OCI’s AI-accelerated GPU clusters, optimized for deep learning and scientific computing, Vector Space Biosciences intends to innovate faster and accelerate the development of new applications, including new multi-omic innovations in drug repurposing, drug design, materials design and precision medicine.

For humans to establish a lunar base or go to Mars, understanding how to protect and repair the human body during spaceflight is necessary. Spaceflight exposes the human body to a variety of stressors, including microgravity and radiation in the form of Galactic Cosmic Rays (GCRs) and HZE particles, the high-energy nuclei component of GCRs. These stressors damage the human body. Vector Space Biosciences, along with its technical and scientific collaborators including Oracle, Imperial College London (ICL), University College London (UCL), Cal Poly Pomona, University of California San Diego (UCSD), IFO Rome, McGill University and others, will leverage advanced language modeling techniques, including context-dependent hidden relationship detection between proteins, pathways, drug compounds and molecular sequences, to enable new hypotheses, insights, interpretations and discoveries. The work will include the design and development of biological CubeSats, launched from Vandenberg Space Force Base initially into Low Earth Orbit (LEO) and later on deep space missions, to conduct biological experiments under microgravity and radiation. The resulting data will be analyzed and interpreted using advanced language modeling and visualization.

Language models represent the leading edge in AI (Artificial Intelligence) and ML (Machine Learning) today. In collaboration with Oracle, Vector Space Biosciences can co-develop more complex, high-performance applications, including:

  1. The Correlation Matrix Dataset Builder (CMDB) API: A REST-based API enabling new ways of clustering context-dependent known and hidden relationships [1, 2] between proteins, pathways, drug compounds and molecular sequences in real time, based on the latest advancements in language modeling.
  2. The Protein-Protein Interaction Network (PPIN) API: A REST-based API which can be used to generate a multi-level graph network from a correlation matrix dataset. The graph network represents context-dependent known and hidden relationships between proteins, pathways, drug compounds and molecular sequences.
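
The release does not describe how the PPIN API turns a correlation matrix into a graph. As a rough illustration only, the sketch below assumes one simple approach: link two entities when the absolute value of their correlation score meets a threshold, then group nodes into levels by hop distance from a chosen seed. The function names, the threshold, the entity labels and the scores are all hypothetical and are not taken from the CMDB or PPIN APIs.

```python
from collections import deque


def build_graph(labels, matrix, threshold=0.5):
    """Return an adjacency dict linking labels whose |score| >= threshold.

    Assumption: `matrix` is a square, symmetric list of lists aligned with `labels`.
    """
    graph = {label: {} for label in labels}
    for i, a in enumerate(labels):
        for j in range(i + 1, len(labels)):
            score = matrix[i][j]
            if abs(score) >= threshold:
                b = labels[j]
                graph[a][b] = score
                graph[b][a] = score
    return graph


def levels_from(graph, seed):
    """Group reachable nodes by hop distance from `seed` (breadth-first search)."""
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    levels = {}
    for node, hops in distance.items():
        levels.setdefault(hops, []).append(node)
    return {hops: sorted(nodes) for hops, nodes in sorted(levels.items())}


# Toy labels and made-up scores, for illustration only.
labels = ["protein_A", "protein_B", "compound_C", "pathway_D"]
matrix = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.7, 0.3],
    [0.1, 0.7, 1.0, 0.6],
    [0.2, 0.3, 0.6, 1.0],
]

graph = build_graph(labels, matrix, threshold=0.5)
print(levels_from(graph, "protein_A"))
```

In this toy case, protein_A has one direct neighbor, and the other two entities are reached only through intermediate nodes, which is one way to read "hidden" relationships in a graph of this kind.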

Language models and their vector representations (embeddings) are at the core of recent breakthroughs in AI. These include AlphaFold2’s ability to predict the way a protein folds based on a sequence of amino acids; ProteinMPNN’s ability to design entirely new proteins from scratch; Chroma, the “DALL-E 2 for Biology”; BioNeMo, NVIDIA’s large language model (LLM) offering, which can generate, predict and understand biological data in new ways; and the recent release of OpenAI’s ChatGPT, a breakthrough language model for human dialogue. Languages can exist in many different forms: a sequence of DNA or RNA, a string of amino acids, a series of nucleotides or other molecular sequences, or musical, mathematical, software or chemical notation. Any sequence of symbols is a language, including human language.
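
To make the idea of treating a molecular sequence as a language more concrete, the sketch below splits a DNA string into overlapping three-letter "words" (k-mers), counts them to form a simple vector, and compares two sequences by cosine similarity. This is a deliberately simplified stand-in: the models named above learn their embeddings from data rather than counting k-mers, and the function names and example sequences here are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(sequence, k=3):
    """Count overlapping k-letter substrings (k-mers) in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction to compare
    return dot / (norm_a * norm_b)


# Made-up sequences, for illustration only.
seq_1 = kmer_counts("ATGATGCCA")
seq_2 = kmer_counts("ATGATGCCT")
seq_3 = kmer_counts("GGGGGGGGG")

print(cosine_similarity(seq_1, seq_2))  # high: the sequences share most k-mers
print(cosine_similarity(seq_1, seq_3))  # zero: no k-mers in common
```

Pairwise scores of this kind, computed across many entities, are one simple way a correlation matrix could be populated.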

“Vector Space Biosciences is ambitiously developing AI and modern language modeling to make crucial breakthroughs in biotechnology, and we’re honored to support their pioneering efforts to advance human spaceflight with our computational capabilities,” said Karan Batta, vice president, product management, OCI. “With the combined computing power of OCI and NVIDIA, Vector Space Biosciences is pushing the boundaries of AI, demonstrating its potential to test and develop key technologies that will advance the space sciences.”

Language models will continue to be core to additional applications currently being developed by Vector Space Biosciences, supported by OCI. These applications will span multiple life sciences industries, including biotechnology and pharmaceutical development, and will include scheduled launches of biological CubeSats, which will serve as a revenue generator. Correlation matrix dataset provenance and security are managed via the VXV wallet-enabled API architecture, which supports utility token transactions signed on-chain.

Vector Space Biosciences will also use OCI Storage, OCI Compute, CPU and networking capabilities, along with MySQL Database Service, to support its data platform, which is capable of producing more than 100,000,000,000 different real-time datasets. OCI’s bare metal compute instances powered by NVIDIA Tensor Core GPUs offer biosciences researchers an HPC platform for applications that rely heavily on language modeling, machine learning and data-intensive jobs.

About Vector Space Biosciences, Inc.

Vector Space Biosciences, Inc. (SBIO), parent company of Vectorspace AI (VXV), along with its scientific collaborators, launches biological CubeSats to generate data related to microgravity and radiation. This data leads to the development of countermeasures against diseases associated with spaceflight stressors, with the aim of protecting and repairing the human body during spaceflight. This includes using a network of scientific data engineering pipelines to build targeted language models, resulting in real-time datasets that power Artificial Intelligence (AI) operations in space biosciences, biotechnology and pharmaceutical development. Working with leading scientific labs in the areas of human aging, cancer and nutrigenomics, our goal is to accelerate the process of new hypothesis generation and novel discoveries in space biosciences, including materials sciences in the area of nanotechnology and nanomedicines. Built on advanced large and small language modeling technologies, our platform is capable of producing more than 100,000,000,000 different real-time datasets for the purpose of accelerating discoveries. Innovations in space biosciences result in products and services for all industries, including the financial markets, and, more importantly, in new forms of precision medicine for all humankind.


Oracle, Java, and MySQL are registered trademarks of Oracle Corporation.