This page contains the details of the surveys and interviews that are being performed by JRA1.1 with user communities, research infrastructures, e-infrastructures, and other stakeholders of the AARC project.
Surveys
BioVel
BioVel supports researchers in the domain of ecology, biodiversity and ecosystems science.
The requirements reported by BioVel in this document are also applicable, more generally, to most of the environmental sciences.
The use case of BioVel can be described with the following two trust relations:
- The relation between end users (for example, researchers) and the service providers that offer specialized, domain-specific data and analytical services.
- The relation between those domain-specific service providers and the multi-domain e-infrastructure providers such as EGI.eu, EUDAT and PRACE, as well as commercial providers such as AWS.
The first trust relation is typically secured by a username/password-based SSO authentication and authorization mechanism. Because the service providers are unrelated to one another, a mechanism like the one used for e-journal access has to be deployed, i.e. a sign-on that persists across multiple service providers for a limited period. Additionally, the persistent sign-on has to be capable of being delegated automatically to workflows and agents acting on the user's behalf at the machine-to-machine level. Such workflows and agents may initiate transactions with multiple SPs in sequence.
The second trust relation, between each community SP and one or more non-community-specific service providers, is unique to each SP/FP pairing. There is no requirement for persistence across pairings.
As initially described in section 2.1.1, BioVel users need to access data repositories to search, access and upload data, and to access computing services to process the data and then store the results of their analysis. The BioVel community relies both on internal service providers, for example for the data repositories, and on the European multi-disciplinary e-infrastructures, for example to access computing capacity.
Because the community accesses a number of heterogeneous service providers, single sign-on (SSO) capabilities are fundamental to enable scalable workflows, together with a uniform authorization infrastructure. The workflows also require that a user's authorization can be delegated to a service, so that the service can access data or perform actions on the user's behalf.
BioVel also foresees interaction with citizen scientists; some use cases may therefore require the integration of credentials with a low level of assurance, such as social media credentials.
AAI technologies
The community is still at the beginning of adopting federated identity and authorization solutions. Because it works closely with EGI and other service providers that use X509 certificates as a means of authentication, the community relies on the IGTF federation of certification authorities. However, many new users could regard obtaining and maintaining a personal certificate as an excessive overhead. This solution is also not feasible for homeless users, i.e. users without an institutional identity provider.
Penetration of federated identity management
Although BioVel management understands the need for federated identity management, the community has limited experience with federated AAI solutions. The research community does not yet have an AAI solution in place, and it still needs to acquire the necessary knowledge.
Most of the users have credentials from their institutional IdPs, but the percentage of these federated in eduGAIN or other federations has not been assessed.
So far, the AAI federations and AAI coordination activities have focused on the direct interaction between end user and service provider, in other words on enabling simple SSO capabilities on the services. This approach is necessary but probably not sufficient to fulfil the BioVel requirements, which envisage a more complex relation between service providers that need to interact with each other to enable the workflows of the community. Delegation and uniform authorization across service providers will be fundamental building blocks of the BioVel infrastructure.
The main barrier for BioVel is the lack of information and knowledge. The community would benefit from a reliable and organized source of information, in the form of online documentation that can be consulted to take informed decisions, possibly complemented by training. Currently the information is very scattered, and it is challenging to get a full picture that includes the IdP documentation, the IdP federations, and the requirements and best practices of the SP federations.
The training and the documentation should be integrated with a support service and troubleshooting tools, to maximise the efficiency of the federations.
DARIAH
DARIAH-EU is the "Digital Research Infrastructure for the Arts and Humanities"; its legal form is an ERIC (European Research Infrastructure Consortium). The blocks composing the research infrastructure build on national initiatives.
Digital research methods are a fundamental part of the mainstream of humanities, arts and social sciences research. The digital arts and humanities are at a critical point in the transition from a specialty area to a full-fledged community with a common set of methods, sources of evidence and infrastructure. All of these are necessary for achieving academic and data-driven scientific recognition. Information- and data-intensive, distributed, collaborative and multidisciplinary research is now the norm in many scientific areas. The goal of DARIAH is to be an infrastructure that ensures that the state of the art of these collaborations is preserved and integrated, and that common best practices and methodological and technological standards are followed, also in the field of AAI.
Currently the DARIAH community has almost 3000 active users.
Adopted Authentication & Authorisation Technologies
The DARIAH infrastructure blocks are built within national initiatives. AAI is based on SAML authentication combined with attribute aggregation. A DARIAH homeless account is available.
Personal data of users are stored in a central clustered LDAP server. Group memberships that provide access to services and Wiki spaces, as well as the user data, are managed via a web-based administration portal. Attribute queries, as defined in SAML and implemented in Shibboleth, are used to aggregate information from the campus IdP and the DARIAH Attribute Authority implemented in the DARIAH IdP. A registration mechanism based on a central DARIAH SP ensures that all personal data that are needed, but not provided by the campus IdPs, are collected as self-asserted data from the user. The DARIAH IdP thus acts as an IdP-AA, but not as an SP, i.e. it is not a proxy.
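The aggregation step described above can be illustrated with a minimal Python sketch. It is not the DARIAH implementation: the set of required attributes, the function names and the rule that campus IdP values take precedence are all assumptions made for illustration. The sketch shows how a registration step could work out which attributes must be self-asserted and how the two attribute sources could be merged.

```python
from typing import Dict, Set

# Hypothetical set of attributes a service needs; the names are illustrative only.
REQUIRED_ATTRIBUTES: Set[str] = {"eduPersonPrincipalName", "mail", "displayName"}


def missing_attributes(released: Dict[str, str]) -> Set[str]:
    """Return the required attributes that are absent or empty in `released`."""
    return {name for name in REQUIRED_ATTRIBUTES if not released.get(name)}


def aggregate(campus_attrs: Dict[str, str], aa_attrs: Dict[str, str]) -> Dict[str, str]:
    """Merge campus IdP attributes with attribute authority attributes.

    Assumption for this sketch: non-empty campus IdP values take precedence,
    and the attribute authority only fills the gaps.
    """
    merged = dict(aa_attrs)
    merged.update({key: value for key, value in campus_attrs.items() if value})
    return merged
```

In this sketch, the result of `missing_attributes` for the campus IdP attributes corresponds to the data a registration form would ask the user to self-assert.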
Penetration of federated identity management
Login via DFN-AAI/eduGAIN is feasible and is used by a number of users. However, the homeless IdP LDAP server holds many accounts for users who either have no federated IdP or whose IdP does not release ePPN (eduPersonPrincipalName).
In addition, some users are simply not aware that "a DARIAH account" can be their institutional account; they do not even try to log in via AAI and go for the homeless user option instead.
Authentication and authorization technologies
DARIAH user authentication relies on the institutional IdPs of its users, which are part of national federations and eduGAIN, and on the catch-all community IdP that hosts homeless users, who make up a considerable fraction of the community user base.
DARIAH is interested in SAML2 and OpenID/OAuth2 technologies, plus X509 credentials for legacy reasons.
It is important for the community that the authentication technologies are as user-friendly as possible. Support for delegation and non-web access, on top of the normal web-accessible services, is also important for the community.
Attribute release policies
DARIAH users will use either the homeless IdP or exactly one campus IdP, with authorization and additional attributes provided by the VO via SAML attribute queries.
Having campus IdPs release ePPN is critical for the DARIAH AAI. The community has been working with a number of initiatives (notably CoCo) to improve the current situation. More effort should be made, in a scalable way, to a) increase the number of such IdPs and b) find a way to know whether a given IdP will release ePPN to DARIAH services (e.g. through the respective entity categories of IdPs) before the first user is affected and perhaps disappointed.
As an attempt to solve this, DARIAH decided that a) SPs must express eduPersonPrincipalName as required (via SAML metadata) and b) users' campus IdPs should honor this. If they do not, the user will have to apply for a DARIAH homeless account.
LoA management
DARIAH services are used for research and educational purposes only.
Users are therefore classified either as belonging to a research institution (access via eduGAIN or an institutional e-mail address qualifies for this) or as so-called citizen researchers.
Accounts that fall into the latter category are checked manually, i.e. through mail communication to make sure that the user is involved in research. Any institution that is not yet known (mail domains are stored) is checked manually as well, in order to be sorted into one of the two categories.
After this manual check there is no need for further information about differentiated LoA.
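The classification rules above can be sketched as a small Python function. This is only an illustration under stated assumptions: the names `Category`, `Decision` and `classify`, and the idea of storing known mail domains in a dictionary, are hypothetical and not taken from the DARIAH portal.

```python
from enum import Enum
from typing import Dict, NamedTuple, Optional


class Category(Enum):
    RESEARCH_INSTITUTION = "research institution"
    CITIZEN_RESEARCHER = "citizen researcher"


class Decision(NamedTuple):
    category: Optional[Category]  # None while the mail domain is still unknown
    manual_check: bool            # True if a person has to review the account


def classify(email: str, via_edugain: bool,
             known_domains: Dict[str, Category]) -> Decision:
    """Classify an account following the rules described in the text.

    `known_domains` is a hypothetical store of already reviewed mail domains.
    """
    if via_edugain:
        # Access via eduGAIN qualifies as belonging to a research institution.
        return Decision(Category.RESEARCH_INSTITUTION, False)
    domain = email.rsplit("@", 1)[-1].lower()
    category = known_domains.get(domain)
    if category is Category.RESEARCH_INSTITUTION:
        return Decision(category, False)
    # Citizen researchers and unknown domains are both checked manually.
    return Decision(category, True)
```

A domain that is reviewed and found to be institutional would then be added to `known_domains`, so that later accounts from the same domain need no further check.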
Attribute management and community managed authorization
The only user identifier used is the ePPN; DARIAH links users' ePPNs and accounts together in the DARIAH portal.
Via SAML attribute queries keyed by the ePPN, the homeless IdP delivers the following attributes:
- any needed personal attributes the campus IdP did not provide, e.g. mail
- the accepted terms of use for the service in question
- authorization attributes, i.e. the names of the authorization groups the user is a member of
Authorization group membership is managed manually via the administration portal in a distributed way, i.e. by the administrators of DARIAH countries, organizations, and projects.
DARIAH is therefore using community attributes to authorize access to internal services and potentially to all the services supporting the community.
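As a rough illustration of authorization based on community attributes, the Python sketch below models an attribute lookup keyed by ePPN and an access decision built on it. The in-memory `DIRECTORY`, the record layout, and the rule that both accepted terms of use and group membership are required are assumptions for this sketch; a real deployment would issue SAML attribute queries instead of a dictionary lookup.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class AttributeRecord:
    """Attributes returned for one ePPN (hypothetical layout)."""
    mail: Optional[str] = None
    accepted_terms: FrozenSet[str] = frozenset()  # services whose terms were accepted
    groups: FrozenSet[str] = frozenset()          # authorization group names


# Stand-in for the attribute authority; entries are invented example data.
DIRECTORY: Dict[str, AttributeRecord] = {
    "alice@uni.example": AttributeRecord(
        mail="alice@uni.example",
        accepted_terms=frozenset({"wiki"}),
        groups=frozenset({"wiki-users"}),
    ),
}


def attribute_query(eppn: str) -> Optional[AttributeRecord]:
    """Look up the attributes keyed by ePPN; None if the user is unknown."""
    return DIRECTORY.get(eppn)


def is_authorized(eppn: str, service: str, required_group: str) -> bool:
    """Grant access only if the terms are accepted and the group matches.

    Requiring both conditions is an assumption made for this sketch.
    """
    record = attribute_query(eppn)
    if record is None:
        return False
    return service in record.accepted_terms and required_group in record.groups
```

Keeping the decision in one function makes it visible that the service itself stores no membership data and relies entirely on the attributes delivered for the ePPN.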
EISCAT
EISCAT, the European Incoherent Scatter Scientific Association, is established to conduct research on the lower, middle and upper atmosphere and ionosphere using the incoherent scatter radar technique.
The EISCAT Scientific Association is funded by six research councils. The operation of the facilities is divided into two halves: one is a common programme for joint activities, and the other is distributed among the associates according to their funding.
The lower levels of the gathered data are available only to the associate countries, and for data from the non-common programme each associate has exclusive rights for one year. In recent years, a programme has been opened that lets smaller organisations operate the facilities at relatively small cost. These affiliate organisations have the right to access data for one year after the date of observation.
Access control for this data has so far been based on IP addresses, but with the inclusion of affiliates this is becoming more and more complicated. In addition, data downloads are not logged, so there is no way to inform users about newly discovered problems with the data they have downloaded. Nor is any information collected about the kind of study for which the data has been downloaded, which would be needed for reporting to the owners.
The use case here is a reliable way to authenticate who downloads data and, possibly, to record why. One option is an 'EISCAT' certificate for users that covers who the user is, why and when the data is needed, and how the user will handle it. Different certificate levels could correspond to different levels of data.
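The logging gap described above could be closed along the following lines. This Python sketch is purely illustrative and does not describe an existing EISCAT system: the class names, the fields and the rule that a purpose must be given are assumptions. It records who downloaded which dataset, why and when, and lists the users to notify if a problem with a dataset is found later.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class DownloadRecord:
    user_id: str        # authenticated identity of the downloader
    dataset: str        # identifier of the downloaded dataset
    purpose: str        # the kind of study the data is used for
    timestamp: datetime


class DownloadLog:
    """In-memory log of data downloads (a stand-in for persistent storage)."""

    def __init__(self) -> None:
        self._records: List[DownloadRecord] = []

    def record(self, user_id: str, dataset: str, purpose: str) -> DownloadRecord:
        """Store one download; a non-empty purpose is required in this sketch."""
        if not purpose.strip():
            raise ValueError("a purpose is required for every download")
        entry = DownloadRecord(user_id, dataset, purpose.strip(),
                               datetime.now(timezone.utc))
        self._records.append(entry)
        return entry

    def users_of(self, dataset: str) -> List[str]:
        """Return the users to notify about problems with `dataset`."""
        return sorted({r.user_id for r in self._records if r.dataset == dataset})
```

With per-user records of this kind, the purposes could also be summarised for reporting to the owners of the data.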
Trust models and workflows
The main use case for authentication and authorization in EISCAT is to grant access to datasets to the institutions and users who are eligible to download the data. Federated AAI could also make accounting for data usage easier.
Access to the datasets has so far been controlled using the IP address of the client. With the extension of the user base to include additional affiliates, service providers will need to adopt a more sophisticated authorization mechanism with user-by-user granularity.
Currently EISCAT is not using AAI solutions directly integrated with the services.
Penetration of federated identity management
In general the EISCAT community lacks information about federated identity management.
Photon and Neutron community (Umbrella)
Umbrella is an identity system designed by the European Photon and Neutron source facilities (PaNs). It aims to make life easier and science more productive both for the facilities and for their users. First of all, Umbrella provides any PaN user (and effectively anyone interested in scientific discovery) with a unique identity, the UmbrellaID. Equipped with such an ID, a user can access the facilities virtually with a single sign-on. Since the same identity is known at each of the facilities, a user can more easily access or share data, manage administrative processes, or make use of services and infrastructures provided by the PaNs. Umbrella is a joint project of the PaNs and other facilities with similar needs for an identity management system. The joint nature of this undertaking is the major benefit for the facilities, because it allows them to share the effort of developing and maintaining the Umbrella system. Services offered by one of the facilities can be used by any of the users. This sharing of services within the Umbrella federation not only reduces the overall maintenance effort but also leads to a richer ecosystem of services for the user communities.
Future user operation at large-scale facilities creates user needs that call for a unique, persistent user identification giving unified access to the following functionalities:
- trans-facility access, since 40% of the users run experiments at different facilities
- access to and management of experimental data
- online entry mode: remote experiment access
- access to efficient data analysis tools
- remote file access
- minimal administrative load for users
Umbrella is part of several FP7 projects, namely:
- EuroFEL: ESFRI project Free Electron Lasers of Europe
- PaNData-Europe and PaNData ODI: FP7 projects
- CRISP: cluster project of different ESFRI projects
- CALIPSO: I3 synchrotron community
- NMI3: I3 neutron community
- BioStruct-X: structural biology with synchrotron radiation
Adopted Authentication & Authorisation Technologies
The photon and neutron community is using the Umbrella infrastructure for authorization. The technologies relevant for the community are SAML2 and X509 certificates.
Umbrella users use credentials from their institutional IdPs, and federation through eduGAIN is an added value. Alternatively, users who cannot access a federated IdP can use the catch-all IdP provided by Umbrella itself.
For the photon and neutron use cases, the AAI must support an easy single sign-on solution, web-based and non-web applications, and delegation.
Attribute release policies
Not relevant.
LoA management
LoA management is relevant for the Umbrella use case.
Attribute management and community managed authorization
The community already has a unique identifier for its users. It is provided by the Umbrella infrastructure, since this has been a very important requirement for the community use case from the beginning.
Authorization based on community attributes is less relevant for the photon and neutron use case, since the authorization is entirely regulated by the service providers who enable users to access their services.
