


Be Cloud-Agnostic: A Solution for Computing on Genomics Datasets in Distributed Cloud Locations

The Multi-Cloud features on the Seven Bridges Platform allow you to work in a “cloud-agnostic” manner, enabling researchers to access and compute on datasets stored on multiple cloud locations to save time and money.

Empower your research with relevant datasets regardless of where the data lives

Starting a research project with data distributed in multi-cloud environments can be daunting. First, you need to know where your datasets of interest reside. Specifically, which provider is hosting the datasets and in which regions? Beware, some datasets must remain in specific cloud locations due to policy restrictions, which may create another barrier between you and the data you need. On top of that, the logistics of downloading, storing, and running computations on datasets locally can be cumbersome, and the specter of data egress costs is a common concern. Do you have the time, money, and equipment to run computations on the data locally instead of in the cloud? As you can see, the information that you need to know before you can even begin a project can be difficult to find, and the sheer volume of details to keep track of is just another distraction from your research.

So, how did we at Seven Bridges make this process easier? You told us you want to access datasets regardless of where they are stored, and to compute on that data only in its given cloud location. You asked for the tools to make informed decisions about how and where to run your computation. And you want to save some time and money while you’re at it.

A single pane of glass: one platform with support for multiple cloud locations

The Seven Bridges Platform solves the distributed data problem by acting as a “single pane of glass”: a platform that can run computations on data distributed across cloud providers from one user interface. Currently, the Seven Bridges Platform is featured on both Amazon Web Services and Google Cloud environments.

But why is this “single pane of glass” approach so useful? Suppose you are a researcher for a pharmaceutical company based in San Francisco storing data in a western US cloud location. Meanwhile, your colleague in the Boston-based branch of the same company is storing their data in an eastern US cloud location. You both want to log into the same platform, run analysis on the combined data, and then compute on results together in one location for future analysis. With most other biomedical data analysis platforms, there simply is not a streamlined method to do so, and it is costly to duplicate datasets and administration/billing systems to host them again in new cloud locations to run computations. Ideally, you would want to use one platform with support for multiple cloud locations so that from one user interface, you could find and access data on different cloud locations and then choose to run computation where the data is stored.

The Seven Bridges Platform’s cloud-agnostic approach achieves this and empowers you to:

- combine distributed datasets together for streamlined analysis with colleagues.
- remain compliant when your desired dataset is not permitted to leave a specific geographic location.
- avoid unnecessary data egress costs from moving your data around to run a computation in a different cloud or region from where the data is hosted.

Using the Seven Bridges Platform allows you to focus more on the science of your analysis, not on the logistics. No other biomedical data analysis company can offer this degree of multi-cloud support for genomics research.

Save time and money by computing on data where it lives

Now suppose you are an academic researcher who wants to perform a genome-wide association study (GWAS) using both the public Kids First dataset, which is on the Amazon US-east-1 cloud, and datasets from the NHLBI Trans-Omics for Precision Medicine Initiative stored on the Google US-east-1 cloud. You want to combine data from multiple distributed datasets and run GWAS analyses. Ordinarily, when running an analysis with input files from different cloud locations, there will be data egress when moving the data from one cloud location to the other for computation. Working with raw FASTQ or CRAM/BAM files means large volumes of data, and thus more time and money required to move them.
