Research Data Management Services Unit
Scientific research is increasingly digital, with expanding volumes of data that necessitate good data management practices. Our role is to provide information, consultation, support, and training to researchers through all phases of the research data life cycle (Planning, Data Collection, Management and Analysis, Preservation and Sharing).
We provide advice and support in the following areas
Note: This site is still under construction. Feel free to get in touch with any suggestions or inquiries.
I am the Research Data Management Project Manager. Previously, I worked as a Data Scientist at KPMG AG and was a research assistant at Middle East Technical University for about 10 years.
My research areas are genetic and health data management; the design of Personal Health Record systems that include genetic data; privacy and security of Electronic Health Records; and Personal Data Protection laws and regulations.
I have experience in the privacy and confidentiality of health and genetic data, data analysis methods, machine learning algorithms and IT infrastructures.
25th of January 2021, Electronic Lab Notebooks & Survey results
15th of October 2020, Introducing the Research Data Management Unit
Please note that the calls are NOT recorded, in order to promote free discussion. Materials from the calls are only available in the intranet.
If you would like to suggest a topic, please get in touch.
- is only available in the intranet.
- Research data management (RDM) open training materials on
RDM services in other organisations:
If you need to reach us, you have several options. You can fill in a form or contact us via:
Join our mailing list on specific topics:
Mailing list for MDC employees interested in participating in a working group on data sharing:
Mailing list for MDC members interested in participating in a working group on Electronic Laboratory Notebooks (ELN):
Phone: +49 30 9406-3100
Research Data Management
Accountability and Integrity
Ensuring data is preserved
Minimizing risk for data loss
Ensuring data is findable and easily discoverable, which saves time
Data can be used and reused, avoiding duplicated effort
Encouraging collaboration: research is increasingly global and requires collaborative data sharing
Minimizes time and effort preparing for publication and open data requirements
Increased recognition, citation, and scholarly impact
Ensuring intellectual property rights are preserved
Compliance with funders' requirements and institutional policies
Ensuring public funds are well spent and their value is maximized
Ensuring scientists have what they need to do good science and to create good quality data
Helps you answer the big question “I have big data, now what can I do with it?”
Data & Metadata:
Research data is any type of information that is collected, observed, or created in the context of research. As such, data can be:
- Primary: raw data from measurements or instruments
- Secondary: processed data from secondary analysis and interpretation
- Published: final format available for use and reuse
- Metadata: data about your data
Metadata is independent data that contains structured information about other data, i.e. data about data. In other words, it is the data that provides essential context and relevant information about how the data was created, stored and shared.
For instance, metadata can describe various aspects of an experimental procedure, such as who carried out the experiment, which parameters were chosen, what type of equipment was used, the output and results, and how the results were analysed, shared and used.
- It ensures reliability, accessibility and discoverability
- Increases the value of your data
- Reduces duplicated effort
- Allows us to track people, institutions or publications associated with the original research
- Enables researchers to quickly assess the quality and relevance of the data
- Metadata is frequently required for depositing data in repositories.
Descriptive metadata: Information outlining the basic facts necessary for discovery and identification, e.g. title, authors, keywords and abstract.
Structural metadata: Information regarding the structure (organisation and relationships) of a dataset and its underlying items. For instance, it could be a description of the enclosed files and scripts, how they are organised and related, and where they can be found, e.g. via a DOI.
Administrative metadata: Technical information and information regarding the management of the data, including licensing and copyright permissions, technical requirements, file formats, provenance (i.e. history of ownership: who owns the data and where it came from), access and sharing controls and permissions, quality controls and integrity checks.
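As an illustration of the three categories above, the following is a minimal sketch of a metadata record for a hypothetical dataset, grouped by category and serialised as JSON. All field names and values are invented for the example and do not follow any particular metadata standard.

```python
import json

# Hypothetical metadata record; every field name and value is illustrative.
record = {
    "descriptive": {
        "title": "Example imaging dataset",
        "authors": ["A. Researcher"],
        "keywords": ["microscopy", "example"],
        "abstract": "Short summary of what the dataset contains.",
    },
    "structural": {
        "files": [
            {"path": "raw/plate01.tif", "role": "raw data"},
            {"path": "scripts/analyse.py", "role": "analysis script"},
        ],
    },
    "administrative": {
        "license": "CC-BY-4.0",
        "file_formats": ["TIFF", "Python"],
        "provenance": "Acquired on instrument X by A. Researcher",
        "access": "open",
    },
}


def to_json(metadata: dict) -> str:
    """Serialise the record as human-readable JSON."""
    return json.dumps(metadata, indent=2, sort_keys=True)


print(to_json(record))
```

Keeping the record in a plain, structured text format like this means it can sit next to the data files and be read by both people and scripts.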
- Metadata standards enable the structuring of metadata and enhance its interoperability by using common terms and definitions, which brings consistency and accuracy to data documentation.
- Metadata standards offer technical specifications that ensure values such as units of measurement, dates and times are entered in controlled formats.
- The standards can be discipline specific or general, such as the DataCite Metadata Schema, the Data Documentation Initiative (DDI) and the International Organization for Standardization (ISO).
- Examples of metadata standards and tools for lab-based research:
- ISA framework and tools:
- Minimum Information for Biological and Biomedical Investigations:
- Examples of metadata standards for software:
- The Software Ontology
- PROV Ontology (PROV-O)
- A text or html document.
- An XML document linked to data files
- Information embedded in an XML data file
XML (eXtensible Markup Language) files include key data and metadata documentation that is interoperable across web browsers and analysis engines, which in turn enables field-specific searching.
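To make the XML option more concrete, here is a minimal sketch that writes a small metadata document with Python's standard library and then reads a single field back, which is the kind of field-specific lookup described above. The element names (`metadata`, `title`, `creator`, `created`) are assumptions for the example and are not taken from any metadata standard.

```python
import xml.etree.ElementTree as ET


def build_metadata_xml(title: str, creator: str, created: str) -> str:
    """Return a small XML metadata document as a string."""
    root = ET.Element("metadata")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "creator").text = creator
    # An ISO 8601 date (YYYY-MM-DD) keeps the value unambiguous for machines.
    ET.SubElement(root, "created").text = created
    return ET.tostring(root, encoding="unicode")


xml_text = build_metadata_xml("Example dataset", "A. Researcher", "2021-01-25")
print(xml_text)

# Field-specific search: read back a single element by name.
parsed = ET.fromstring(xml_text)
print(parsed.findtext("creator"))
```

In practice you would use the element names and structure prescribed by the metadata standard of your field rather than inventing your own.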
- Use an ELN (Electronic Laboratory Notebook) to record your work
- Use version control to track history, progress and changes in a descriptive manner
- Use metadata standards
- Use README files
Open data is data that can be freely used, re-used and redistributed by anyone - subject only, at most, to the requirement to attribute and share-alike.
- Open Knowledge Foundation
FAIR is a set of principles to define the best practices for data and software to facilitate discovery, access and reuse by humans and machines.
FAIR stands for:
Findable: Your data should be findable, by you and others.
Accessible: Your data should be accessible for both humans and machines, i.e. retrievable and understandable.
Interoperable: Machines and humans can interpret and use the data in different settings.
Reusable: The ultimate goal of FAIR is to advance the reuse of data. Everything you’ve done so far ultimately leads to this point, ensuring the data can be reused by others.
FAIR data summary
- Deposit your data where others, especially your peers, can find it, e.g. in a field-specific repository, and give it a stable, unique persistent identifier (PID).
- Make your data and metadata accessible via standard means such as HTTP or an API.
- Create metadata and explain in detail what the data is about; never assume people know!
- Deposit metadata with a PID and make it available with or without the data, e.g. in case the data itself is heavily protected.
- Include information on ownership and provenance.
- Outline what reusers of your data are and are not allowed to do by using a clear license, such as the commonly used MIT or Creative Commons licenses (keep in mind funders' requirements).
- Specify access conditions, e.g. whether authentication or authorization is required.
- Describe your data in a standardized fashion using agreed terminology and vocabulary.
- Share the data in preferred & open file formats.
- Start the process early on!
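The summary above can be turned into a simple self-check before depositing a dataset. The sketch below assumes a metadata record stored as a dictionary; the field names and the example DOI are hypothetical and would need to be adapted to the metadata standard and repository you actually use.

```python
# Hypothetical field names mapped to the summary item they correspond to.
REQUIRED_FIELDS = {
    "identifier": "stable unique identifier (PID)",
    "access_url": "accessible via standard means such as HTTP or an API",
    "description": "detailed explanation of what the data is about",
    "provenance": "information on ownership and provenance",
    "license": "clear license for reuse",
    "access_conditions": "authentication or authorization requirements",
    "file_format": "preferred and open file format",
}


def missing_fair_fields(record: dict) -> list:
    """Return the names of required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Illustrative, incomplete record (the DOI is made up).
record = {
    "identifier": "doi:10.1234/example",
    "description": "Example description of the dataset.",
    "license": "CC-BY-4.0",
}

for name in missing_fair_fields(record):
    print(f"missing {name}: {REQUIRED_FIELDS[name]}")
```

A check like this only confirms that the fields are filled in; whether the content is meaningful still needs a human reader.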
You can find more details in our training material on the basics of Research Data Management:
We are currently working on providing a centralized ELN solution for all researchers at the MDC. Please get in touch for further details.
Proprietary formats are file formats that usually can be viewed only in the software/tool which created the files. This software uses its own proprietary format to save and read the file. Only the company itself or licensees may use it. The description of the format is confidential or unpublished and the company/organization has the right to change it at any time.
In contrast, an open format is a file format that is published and free to be used by everybody.
The choice of file format may depend on which phase of the research data life cycle you are in. For instance, when sharing data externally, it is preferable to convert to an open file format; however, while you are analyzing the data, you may need to use file formats compatible with the software used.
Consider the following:
- Choose standard file formats most commonly used in your field.
- Convert data to a standard format.
- Choose a format that is required for data deposition, e.g. repository requirements, archival compression.
- Consider exporting or converting from the original format to a more open/preferred format, but keep in mind that some data might be lost or altered during the process, e.g. text formatting in documents, decimal point formatting, date and time values.
- Keep in mind there are no standard preferred file formats, and none are perfect, but consider choosing open formats that are most applicable for your use and field.
- When archiving data, combine the whole project (i.e., raw data, analysis, documentation, code and software) in one package.
- For software consider the use of containers to enable interoperability and long-term re-use.
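As an example of the conversion pitfalls mentioned above, the sketch below converts a semicolon-delimited export that uses decimal commas and DD.MM.YYYY dates into a comma-delimited CSV with decimal points and ISO 8601 dates. The input layout and the column names (`sample`, `measured_on`, `value`) are assumptions for the example; a real export will differ.

```python
import csv
import io
from datetime import datetime


def to_open_csv(source_text: str) -> str:
    """Convert a semicolon-delimited export into plain comma-delimited CSV.

    Dates are rewritten from DD.MM.YYYY to ISO 8601 and decimal commas
    become decimal points. Values stay as text so no digits are lost.
    """
    reader = csv.DictReader(io.StringIO(source_text), delimiter=";")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["sample", "measured_on", "value"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        measured = datetime.strptime(row["measured_on"], "%d.%m.%Y").date()
        writer.writerow({
            "sample": row["sample"],
            "measured_on": measured.isoformat(),
            "value": row["value"].replace(",", "."),
        })
    return out.getvalue()


exported = "sample;measured_on;value\nS1;25.01.2021;3,14\nS2;15.10.2020;2,50\n"
converted = to_open_csv(exported)
print(converted)
```

The numeric values are deliberately handled as text rather than parsed into floats, so trailing zeros and the original precision survive the conversion.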
For more information, you can visit the link under the RDM tab.
The GDPR defines personal data as any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.
Under the GDPR, the following personal data is considered ‘sensitive’ and is subject to specific processing conditions:
- personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs;
- trade-union membership;
- genetic data, biometric data processed solely to identify a human being;
- health-related data;
- data concerning a person’s sex life or sexual orientation.
Sensitive data can only be shared or published if the researcher
- has the right to publish the data
- obtained freely given consent for data sharing from the participants beforehand
- obtained ethical approval for data publication
and if the data is anonymised and licensed for reuse and attribution.
Otherwise, the researcher can publish the non-identifiable metadata of sensitive/personal data.
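As a rough illustration of separating publishable metadata from identifying information, the sketch below drops a set of fields from a record before it is shared. The field names are hypothetical, and which fields count as identifying has to be assessed case by case. Note that removing direct identifiers is not the same as anonymisation: the remaining fields may still allow re-identification when combined.

```python
# Hypothetical field names; the actual list depends on the dataset.
IDENTIFYING_FIELDS = {"name", "date_of_birth", "address", "patient_id"}


def publishable_metadata(record: dict) -> dict:
    """Return a copy of the record without the fields listed as identifying."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


# Illustrative record with made-up values.
record = {
    "name": "Jane Doe",
    "patient_id": "P-001",
    "study": "Example cohort",
    "assay": "RNA-seq",
}

print(publishable_metadata(record))
```

The original record is left unchanged, so the full version can stay in protected storage while only the filtered copy is published.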