How can I use HELIX?

HELIX aims to make it easier for you to discover, share, and use scientific data.

What follows is only a summary of our services. Consult our guides and tutorials for more information, or join us at one of our open events to learn more about how you can use HELIX.

Discover data. You can search for any data that might interest you, learn more about its lineage, see its licensing terms, and add it to your collection for later use. HELIX hosts, in a single place, both data produced and shared directly by scientists and open data provided by public or private organizations.
Use data. You can download any dataset you want and start using it directly! The samples and visualizations help you easily evaluate whether a dataset covers your needs. HELIX also provides several advanced data services which you can use!
Publish data. You can share with others any data you have produced, are using, or simply find interesting, regardless of size and type. You only need to upload the data along with some basic information (metadata) to help others discover and use them. You can also link your data with your publication, allowing others to easily discover both.
Cite data. Any data you discover and use are assigned a permanent unique identifier, which you can use to cite them in your publications. This works the same way as typical publication identifiers, such as DOIs. Using a permanent identifier for datasets means that they will always be accessible.
Data services. Most of the data provided by HELIX are ingested, managed, and served by highly scalable cloud data engines, removing the need to download the data, deal with their size and complexity, integrate them into other applications, or install and maintain computing infrastructures. You can enjoy fast, highly scalable, complex data processing and analysis through your browser. Give it a go!
- Jupyter (open beta). Anyone familiar with statistics, machine learning, and data wrangling in general is familiar with Jupyter notebooks. They allow you to easily experiment with data, train models, and collaborate with others across multiple languages, like Python or R, and even tap into our HPC infrastructure. HELIX provides Jupyter as a hosted service, with streamlined access to published data, running on highly scalable execution environments. No need to download anything, manage Big Data collections, or perform mundane admin tasks. You can work from anywhere, even using a small tablet.
- Zeppelin (closed beta). Working with Big Data processing environments, such as Apache Spark, is quite challenging. Setting up, scaling, and managing the underlying computing infrastructure requires considerable time and resources. Apache Zeppelin brings the paradigm of Jupyter to Big Data, offering a web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala, and Python. HELIX lets you tap into ready-to-use Big Data frameworks and data collections, so you can focus on your research challenge.
- Domain-specific. HELIX provides several other ways to discover, query, analyze, visualize, and integrate scientific data depending on their specific type or intended application. These facilities include charts, interactive maps, and even RESTful APIs for data processing. The number and variety of these facilities are constantly increasing, as we develop services for high-impact and mature communities.
Discover publications. You can search for Open Access publications published by Greek organizations and researchers across Europe! All publications have been harvested from OpenAIRE, the EU’s Open Access initiative, in which we actively participate and contribute. If your organization already operates an institutional repository indexed by OpenAIRE, then all of your publications are also available through HELIX.

Services for my Organization

HELIX can serve as your institutional scientific data repository or scientific infrastructure, helping you increase the productivity and outreach of your scientific output. There are many ways you can join us, both leveraging your existing infrastructures and minimizing your future investments in data-intensive infrastructures.

Link your data catalogue. If you already have a scientific or an open data catalogue in place, HELIX can harvest its contents and make your data available to the tens of thousands of scientists and researchers that visit HELIX daily. HELIX can help you increase the visibility of your research output, facilitate sharing with external research teams, and allow more users to discover and use your data. HELIX can harvest data from practically any data catalogue and repository, due to its open-standards policy.
- Beyond that, you can opt in to additional services that increase the value of your data. HELIX can maintain copies of your data in its own repository, lowering the burden on your computing infrastructures. Moreover, HELIX can ingest your data into its own data engines, offering them via its data services and APIs to the scientific community at large.
- At any point in time you can dictate your integration level with HELIX, change it, evaluate the potential savings and impact for your organization, and follow a streamlined transition pathway for migrating your existing services to HELIX.
Data repository. HELIX provides you with a free, highly-scalable, Open Access-compliant scientific data repository, covering all data publishing and curation needs of your organization.
- Your data catalogue is available under a sub-domain of your choosing, with your data also being discoverable through the main HELIX catalogue.
- Your scientists, researchers, and students are automatically identified and authorized by HELIX as members of the national scientific community, and have immediate access.
- You can appoint one or more administrators responsible for managing your users and implementing your Data Management Plan (DMP). By the way, if you don’t have a DMP in place, we can help you set one up!
- Your scientists can start publishing immediately! For any questions, they can contact our Helpdesk, consult our how-to guides, or participate in our training courses.
Labs. The power of Jupyter notebooks as a learning resource and scientific instrument is in your hands. All your members, from researchers to undergraduate students, have tiered access to our Labs section, allowing them to analyze, test, and experiment with data within seconds.
- Your scientists can tap into the large-scale computing infrastructures of GRNET, ranging from Apache Spark clusters to HPC. Their experimental data can be uploaded on demand, or be managed by HELIX directly, minimizing the effort and time spent preparing them. Collaboration with other individuals and research teams on the same data and notebooks is available out of the box, while you keep full control over who has access to your data, when, and why.
- Jupyter notebooks are powerful learning instruments for statistics, machine learning, data management, and Data Science in general. With HELIX, you can provide your postgraduate and undergraduate students with bundled datasets and assignments, even organizing full courses. You only need to provide the data (optionally also making them publicly available), notebook templates (optionally, your students can start from a built-in template), and an analysis goal. The notebooks are safely stored, submitted, and tracked, allowing you to judge your students’ progress and accomplishments.
Project-specific data infrastructure. If you need to comply with specific Data Management Policies for individual research projects, but are not ready to bring all your members into HELIX, this option provides the best of both worlds. HELIX can provide you with a project-specific data catalogue, repository, and infrastructure that implements your required Data Management Policy and ensures secure access only to members of your project. All data produced, provided, and evaluated are managed under your full control, allowing you to selectively publish them or keep them private.
Software as a Service. The complete HELIX infrastructure is available as a Service, allowing you to provide a publications and data repository to your members, along with services for experimenting with and using data, without the need to purchase and deploy a new infrastructure. It is the most cost-effective and low-maintenance pathway for supporting data-intensive research at your organization. We host the system under your own domain, handle the day-to-day administration tasks, and provide support to your members.

Services for my Research Infrastructure

HELIX has been designed and developed as a horizontal foundation for other Research Infrastructures, offering cost-effective, highly scalable, and secure access to data management and processing services. In this context, HELIX is a building block of domain-specific infrastructures, addressing their data-intensive requirements, harnessing network effects, and maximizing national investments in research.

Currently, in Phase 1, we provide support to select Research Infrastructures, as we are still developing HELIX and operating under constrained resources. You can expect more Research Infrastructures to tap into HELIX in the upcoming years. Just look for the ‘Powered by HELIX’ logo!